Information from the variation in data

Variation in data is not simply an annoyance; the variation itself can hold important information. An important role of statistics is to display and describe this variation in ways that highlight the information in it.

Yam growth data

The table below shows the growth (cm) of the main stalks of 20 yam plants over a period of seven days.

Growth of yam plants (cm)
10.1 9.2 11.9 6.3 7.4
5.4 9.3 11.1 7.2 6.8
9.1 10.9 10.1 7.4 9.2
9.5 6.0 5.3 8.9 10.4

What can you see?

There is clearly variability between yam plants and a quick scan shows that all values are between 5 and 12 cm. But what else can be easily learned from the table?

Sorting the data can help

It is not easy to obtain further useful information from a table of raw data. Different displays of the data may, however, highlight meaningful patterns. Graphical displays are usually most effective, but even sorting the data into order gives some insight into the values.

The list below again shows the yam growth data. Firstly, examine the unordered list of values. It is difficult to see any unusual features in the raw data.

Drag the slider to the right to sort the data into increasing order, then look for features in the sorted list of values.
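
For readers without the interactive display, the sorting step can be reproduced with a few lines of Python. This is only a minimal sketch: the values are copied from the table above, and the names growth_cm and sorted_growth are chosen here for illustration.

```python
# Growth (cm) of the 20 yam plants, copied row by row from the table above.
growth_cm = [
    10.1, 9.2, 11.9, 6.3, 7.4,
    5.4, 9.3, 11.1, 7.2, 6.8,
    9.1, 10.9, 10.1, 7.4, 9.2,
    9.5, 6.0, 5.3, 8.9, 10.4,
]

# sorted() returns a new list in increasing order and leaves growth_cm unchanged.
sorted_growth = sorted(growth_cm)

# Print the sorted values five per row, mirroring the layout of the table.
for start in range(0, len(sorted_growth), 5):
    row = sorted_growth[start:start + 5]
    print(" ".join(f"{value:4.1f}" for value in row))
```

Reading the printed values from left to right, row by row, is equivalent to reading the sorted list.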

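The sorted list reveals a gap: no plant grew between 7.4 and 8.9 cm, so the values fall into two clusters, eight plants between 5.3 and 7.4 cm and twelve plants between 8.9 and 11.9 cm. Perhaps the two clusters correspond to different varieties of yam, or to yams grown in different types of soil? This pattern suggests that further investigation by the researcher is warranted.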
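
The boundary between the two clusters can also be located numerically. The sketch below splits the sorted values at the widest gap between neighbouring values. This is a simple heuristic chosen here for illustration, not a method prescribed by the text, and the variable names are assumptions.

```python
# Growth (cm) of the 20 yam plants, copied from the table above.
growth_cm = [
    10.1, 9.2, 11.9, 6.3, 7.4,
    5.4, 9.3, 11.1, 7.2, 6.8,
    9.1, 10.9, 10.1, 7.4, 9.2,
    9.5, 6.0, 5.3, 8.9, 10.4,
]
sorted_growth = sorted(growth_cm)

# Differences between neighbouring sorted values, rounded to the
# one-decimal precision of the measurements.
gaps = [round(b - a, 1) for a, b in zip(sorted_growth, sorted_growth[1:])]

# Position of the widest gap; the split falls just after this index.
split = gaps.index(max(gaps))
lower_cluster = sorted_growth[:split + 1]
upper_cluster = sorted_growth[split + 1:]

print(f"Widest gap: {max(gaps)} cm, "
      f"between {lower_cluster[-1]} and {upper_cluster[0]} cm")
print(f"{len(lower_cluster)} plants in the lower cluster, "
      f"{len(upper_cluster)} in the upper cluster")
```

For these data the widest gap is 1.5 cm, between 7.4 and 8.9 cm, leaving 8 plants in the lower cluster and 12 in the upper one. A single widest gap is only a rough guide; with more data, a graphical display would usually be a better way to judge whether clusters are real.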