Scatterplots of all pairs of variables

Displaying relationships becomes even more difficult when there are more than three variables. Some insight into the relationships between the variables can be gained from scatterplots of all pairs of variables. When these scatterplots are arranged into an array, the array is called a scatterplot matrix.
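As a rough illustration of what "all pairs of variables" means for the layout, the sketch below enumerates the cells of a scatterplot matrix. It assumes one common convention, a full square grid whose diagonal cells hold variable names rather than plots; the display described here may be arranged differently. The function name scatterplot_matrix_layout is hypothetical.

```python
from itertools import combinations


def scatterplot_matrix_layout(variables):
    """Map each off-diagonal grid cell (row, col) to its (x, y) variable pair."""
    layout = {}
    for row, y_var in enumerate(variables):
        for col, x_var in enumerate(variables):
            # Assumed convention: diagonal cells show the variable name, not a plot.
            if row != col:
                layout[(row, col)] = (x_var, y_var)
    return layout


variables = ["sepal length", "sepal width", "petal length", "petal width"]
layout = scatterplot_matrix_layout(variables)
distinct_pairs = list(combinations(variables, 2))

print(len(layout))          # scatterplots drawn in the full grid
print(len(distinct_pairs))  # distinct pairs of variables
```

With k variables, a full grid of this kind holds k(k - 1) scatterplots but only k(k - 1)/2 distinct pairs, because each pair appears twice with its axes swapped.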

Brushing

Although a static display of a scatterplot matrix reveals some aspects of the relationships between the variables, more insight into the data is obtained by adding dynamic features.

All scatterplots are usually dynamically linked, so that clicking on a cross on one scatterplot highlights that individual in all scatterplots. This is often extended to allow highlighting of multiple crosses on a scatterplot with a 'brush' tool. Again, the crosses corresponding to these individuals are highlighted on all scatterplots. This is called brushing.
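The linking described above can be sketched in a few lines. This is a minimal illustration, not the implementation behind any particular display: it assumes a rectangular brush, and the helper names brush and panel_points, as well as the four rows of measurements, are hypothetical.

```python
def brush(rows, x_var, y_var, x_range, y_range):
    """Return the indices of individuals whose (x, y) point lies inside the brush."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    return {
        i
        for i, row in enumerate(rows)
        if x_lo <= row[x_var] <= x_hi and y_lo <= row[y_var] <= y_hi
    }


def panel_points(rows, x_var, y_var, selected):
    """List (x, y, highlighted) for one panel, reusing the shared selection."""
    return [(row[x_var], row[y_var], i in selected) for i, row in enumerate(rows)]


# Illustrative measurements in cm (hypothetical, not the data behind the display).
rows = [
    {"sepal_length": 5.0, "sepal_width": 3.4, "petal_length": 1.5, "petal_width": 0.2},
    {"sepal_length": 4.8, "sepal_width": 3.1, "petal_length": 1.4, "petal_width": 0.3},
    {"sepal_length": 6.5, "sepal_width": 2.9, "petal_length": 4.6, "petal_width": 1.4},
    {"sepal_length": 6.8, "sepal_width": 3.0, "petal_length": 5.6, "petal_width": 2.1},
]

# Brush the small-petal region of the petal length/width panel.
selected = brush(rows, "petal_length", "petal_width", (1.0, 2.0), (0.0, 0.5))

# The same individuals are then flagged in the sepal length/width panel.
sepal_panel = panel_points(rows, "sepal_length", "sepal_width", selected)
```

The key design point is that the selection is a set of individuals (row indices), not a set of screen positions, so every panel can look up the same set when deciding which crosses to highlight.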

Dynamic linking of scatterplots and brushing are more easily explained with an example.

Iris data

The scatterplot matrix below shows measurements of the sepals (length and width) and petals (length and width) of 150 irises that were collected from the Gaspé Peninsula.

Various features are apparent ...

Although we detected clusters in all of the scatterplots in the scatterplot matrix above, it is not clear whether the irises in the cluster with low petal length and width are the same irises as those in the cluster on the plot of sepal length and width.

Drag with the mouse over the crosses on one scatterplot. Observe that the separate clusters on the different plots consist largely of the same irises.

Click the checkbox 'Use brush' on the scatterplot matrix, then drag with the mouse over the crosses in the cluster with small petals on the petal length/width plot. When all crosses in this cluster have been highlighted, look at the plot of sepal length and width (at the top left).