Relationships between three or more variables
The relationship between two variables is captured completely by a scatterplot. However, there is no comparable display for three or more variables that clearly expresses their relationships.
Various displays have been proposed for multivariate data, each of which extends the scatterplot in a different way. None of them has the descriptive power that a simple scatterplot has for bivariate data but, with experience, they can give some insight into the relationships between the variables. The pages in this section describe some such displays.
Use different plotting symbols to represent a third variable
For data sets that contain three numerical variables, the simplest display is based on a scatterplot of two of the variables. The third variable is represented by the use of differing symbols instead of identical 'crosses' on the scatterplot. A continuous characteristic of the plotting symbols such as colour, size or angle is often used.
The choice of which of the three variables to represent using the plotting symbol is often arbitrary, but can greatly affect how easily the diagram is interpreted.
Although this kind of scatterplot is easy to draw, it is usually hard to interpret.
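As a rough illustration of how a third variable might be turned into a symbol characteristic, the sketch below linearly rescales its values into a range of symbol sizes, grey levels or angles. The function names, the output ranges and the sample values are all assumptions made for this example; they are not taken from the display described here. The resulting lists could be handed to whatever plotting tool is used to draw the scatterplot of the other two variables.

```python
def scale_to_range(values, lo, hi):
    """Linearly rescale values so the smallest maps to lo and the largest to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values equal: give every point the middle of the range.
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def symbol_encodings(third):
    """Return one size, grey level and angle per data point.

    The ranges are arbitrary choices for this sketch.
    """
    return {
        "size": scale_to_range(third, 20.0, 200.0),  # symbol area
        "grey": scale_to_range(third, 0.9, 0.1),     # light (low) to dark (high)
        "angle": scale_to_range(third, 0.0, 90.0),   # rotation in degrees
    }


# Hypothetical values of the third variable for five points.
third_variable = [6, 24, 60, 120, 240]
encodings = symbol_encodings(third_variable)

for name, per_point in encodings.items():
    print(name, [round(x, 2) for x in per_point])
```

Only one of the three encodings would normally be used at a time. A linear scale is the simplest choice; other mappings may be easier to read for skewed variables.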
Managerial salaries
The personnel department of a large company is examining the salaries of its mid-level executives. The scatterplot below shows the weekly salaries of the 32 executives and their ages.
The number of months that the executives have been employed by the company is shown by the size of the circle in the scatterplot (the larger circles represent longer service).
For any specific age, the salary tends to be higher for those who have had a shorter length of employment with the company. Presumably recent employees have been attracted with higher salaries!
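The kind of comparison described above, holding age roughly fixed and then comparing salaries by length of employment, can also be sketched numerically. The records below are invented purely for illustration and are not the 32 executives in the scatterplot; the band width and the function name are likewise assumptions of this sketch.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (age in years, months of service, weekly salary).
# Invented for illustration only; NOT the data shown in the scatterplot.
records = [
    (35, 12, 1500), (36, 96, 1300), (37, 24, 1480), (38, 120, 1320),
    (45, 18, 1900), (46, 200, 1600), (47, 30, 1850), (48, 240, 1650),
]


def compare_within_age_bands(rows, band_width=10):
    """For each age band, return (mean salary of shorter-service staff,
    mean salary of longer-service staff), split at the band's median service."""
    bands = defaultdict(list)
    for age, months, salary in rows:
        band_start = age // band_width * band_width
        bands[band_start].append((months, salary))

    result = {}
    for band_start, members in sorted(bands.items()):
        cut = median(m for m, _ in members)
        shorter = [s for m, s in members if m <= cut]
        longer = [s for m, s in members if m > cut]
        if shorter and longer:  # skip bands that cannot be split
            result[band_start] = (mean(shorter), mean(longer))
    return result


comparison = compare_within_age_bands(records)
for band_start, (short_mean, long_mean) in comparison.items():
    print(f"ages {band_start}-{band_start + 9}: "
          f"shorter service {short_mean}, longer service {long_mean}")
```

With real data, a higher mean for the shorter-service group in most age bands would be consistent with the pattern read from the circle sizes, though a summary like this is no substitute for looking at the plot.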
Select the options Colour and Angle from the popup menu to use different characteristics of the plotting symbols to represent the length of employment.