Information from the variation in data
Variation in data is not simply an annoyance — the variation itself can hold important information. An important role of statistics is to display and describe this variation in ways that highlight the information in it.
Rain days
The table below shows the number of rainy days in an African village each year between 1994 and 2013.
Since any systematic change in climate (if it exists) is much smaller than the random year-to-year variation in the data, we will ignore the fact that the data are a time series and treat them as an unordered set of values.
| 101 | 92 | 119 | 63 | 74 |
| 54 | 93 | 111 | 72 | 68 |
| 91 | 109 | 101 | 74 | 92 |
| 95 | 60 | 53 | 89 | 104 |
What can you see?
There is clearly variability between years and a quick scan shows that all values are between 53 and 119 days. But what else can be easily learned from the table?
Sorting the data can help
It is not easy to obtain further useful information from a table of raw data. Different displays of the data may, however, highlight meaningful patterns. Graphical displays are usually the most effective, but even sorting the data into order gives some insight into the values.
In time order, it is difficult to see any unusual features in the raw data.
Sorted into increasing order, the rain days are 53, 54, 60, 63, 68, 72, 74, 74, 89, 91, 92, 92, 93, 95, 101, 101, 104, 109, 111 and 119. Look for features in the sorted list of values.
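The sorting step can also be reproduced in code. The following is a minimal Python sketch: the variable names are illustrative, and the values are copied from the table, read row by row (an assumption about the layout that does not affect the sorted result).

```python
# Rain days per year, copied from the table above (read row by row).
rain_days = [
    101, 92, 119, 63, 74,
    54, 93, 111, 72, 68,
    91, 109, 101, 74, 92,
    95, 60, 53, 89, 104,
]

# sorted() returns a new list in increasing order and leaves rain_days unchanged.
sorted_days = sorted(rain_days)

print("smallest:", sorted_days[0])
print("largest:", sorted_days[-1])
print(sorted_days)
```

Printing the smallest and largest values alongside the sorted list makes it easy to check the range noted earlier (53 to 119 days).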
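Once sorted, the values fall into two clusters, one from 53 to 74 days and one from 89 to 119 days, with no years in between. Perhaps the two clusters correspond to different types of year, such as El Niño and La Niña years? This pattern suggests a direction for further investigation by the researcher.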
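One simple way to locate such a split in code is to look for the largest gap between neighbouring sorted values. The sketch below is only an illustrative heuristic, not a method prescribed by this text. The names `gaps`, `low_cluster` and `high_cluster` are hypothetical.

```python
# Rain days per year, copied from the table above (read row by row).
rain_days = [
    101, 92, 119, 63, 74,
    54, 93, 111, 72, 68,
    91, 109, 101, 74, 92,
    95, 60, 53, 89, 104,
]
sorted_days = sorted(rain_days)

# Differences between each pair of neighbouring sorted values.
gaps = [b - a for a, b in zip(sorted_days, sorted_days[1:])]

# Split the sorted list at the largest gap (heuristic, assumed for illustration).
split = gaps.index(max(gaps)) + 1
low_cluster = sorted_days[:split]
high_cluster = sorted_days[split:]

print("largest gap:", max(gaps))
print("low cluster:", low_cluster)
print("high cluster:", high_cluster)
```

A gap-based split like this only suggests where to look. Whether the two groups really reflect different types of year is a question for the researcher.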