Ordering of the 'individuals'
Many basic statistical methods assume that the 'individuals' in a data matrix are unordered — any rearrangement of the rows would give the same information. For example, the weights of 20 cows sampled from a herd would form an unordered data set.
However, sometimes the rows of the data matrix are ordered, usually by time. For example, temperature measurements may be recorded at 10-minute intervals between 9am and 9pm. The resulting temperatures are values of a continuous numerical variable that are time-ordered, and the ordering of the values holds useful information that will help us understand the data. Data of this kind are called time series.
For a preliminary exploration of ordered data, it is often useful to examine them as though they were unordered, but a full analysis should take account of the ordering.
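The distinction can be illustrated with a minimal sketch. The temperatures below are made-up values, and the helper `lag1_autocorrelation` is a hypothetical name introduced only for this illustration. Summaries such as the mean and median are the same however the rows are arranged, whereas a summary that compares neighbouring values depends on the ordering.

```python
import random
import statistics

# Hypothetical temperatures (degrees C) at successive time points;
# these values are invented for illustration and are not real data.
temps = [12.1, 13.0, 14.2, 15.8, 17.1, 18.0,
         18.4, 17.9, 16.5, 15.0, 13.6, 12.4]


def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    mean = statistics.fmean(values)
    numerator = sum((a - mean) * (b - mean)
                    for a, b in zip(values, values[1:]))
    denominator = sum((v - mean) ** 2 for v in values)
    return numerator / denominator


# Rearrange the rows at random (fixed seed so the run is repeatable).
shuffled = temps[:]
random.Random(0).shuffle(shuffled)

# Order-free summaries: identical for both arrangements.
print("mean:  ", statistics.fmean(temps), statistics.fmean(shuffled))
print("median:", statistics.median(temps), statistics.median(shuffled))

# Order-dependent summary: generally differs between the arrangements.
print("lag-1: ", lag1_autocorrelation(temps), lag1_autocorrelation(shuffled))
```

For the time-ordered values the lag-1 autocorrelation is strongly positive because neighbouring temperatures are similar; shuffling the rows destroys that pattern while leaving the mean and median unchanged.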
Examples
In both of the data sets below, the data are time-ordered. In the grape-testing example, the bunches of grapes were tested in the order shown, whereas in the weather example, the rainfalls were recorded annually from July 2001 to June 2013.
If the grape-testing experiment was conducted in an identical way for each bunch of grapes, it would be possible to ignore the ordering of the data and analyse the data as though they were unordered.
This would also be a reasonable initial analysis of these weather data, but if the data had been collected over a longer period when there may have been a trend in rainfall (e.g. from global warming), it would be more important to take account of the ordering of the values.
To include the time-ordering of the data in the data matrix, a new variable can be added, as shown below.
The 'time' variable contains all the information about the ordering of the data, so the data matrix can otherwise be treated as 'unordered'.
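As a rough sketch of this idea, the code below adds a 'time' variable to a small data matrix. The measurements and the column names `time` and `value` are assumptions made for illustration; they are not the data from the examples above. Once the time variable has been added, the rows can be rearranged freely and the original order recovered by sorting on it.

```python
import random

# Hypothetical measurements, listed in the order they were recorded
# (invented values, not taken from the examples in the text).
values = [5.2, 4.9, 5.6, 5.1, 5.8]

# Add an explicit 'time' variable: 1 for the first row, 2 for the second, ...
rows = [{"time": t, "value": v} for t, v in enumerate(values, start=1)]

# Rearrange the rows at random (fixed seed so the run is repeatable).
shuffled = rows[:]
random.Random(1).shuffle(shuffled)

# The ordering is not lost: sorting on 'time' restores the original rows.
restored = sorted(shuffled, key=lambda row: row["time"])

for row in restored:
    print(row["time"], row["value"])
```

Without the 'time' column, the shuffled rows could not be put back in their recorded order.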