Ordering of the 'individuals'
Many basic statistical methods assume that the 'individuals' in a data matrix are unordered — any rearrangement of the rows would give the same information. For example, the weights of 20 loaves of bread sampled from a supermarket would form an unordered data set.
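As a minimal sketch of what 'unordered' means in practice, the Python snippet below checks that common summaries are unchanged when the rows are rearranged. The loaf weights are made-up illustrative values, not data from this page, and `summarise` is a hypothetical helper name.

```python
import random
import statistics

# Hypothetical loaf weights in grams (illustrative values, not from the source).
weights = [798.2, 802.5, 795.1, 810.4, 801.0, 799.3, 804.8, 797.6]

def summarise(values):
    """Return summaries that do not depend on the order of the rows."""
    return {
        "mean": round(statistics.mean(values), 6),
        "median": statistics.median(values),
        "stdev": round(statistics.stdev(values), 6),
    }

# Rearrange the rows with a fixed seed so the example is repeatable.
shuffled = weights[:]
random.Random(0).shuffle(shuffled)

# The summaries agree because none of them uses the row order.
same_summary = summarise(weights) == summarise(shuffled)
print(same_summary)
```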
However, the rows of the data matrix are sometimes ordered, usually by time. For example, the number of admissions at a hospital accident and emergency clinic might be recorded each day for a year. The resulting data are values of a discrete numerical variable that are time-ordered: the ordering of the values holds useful information that helps us understand the data. Data of this kind are called time series.
For a preliminary exploration of ordered data, it is often useful to examine them as though they were unordered, but a full analysis should take account of the ordering.
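The difference between the two kinds of analysis can be illustrated with a small Python sketch. The daily admission counts are invented for illustration, and the lag-1 autocorrelation is just one possible order-aware summary, chosen here as an assumption rather than a method prescribed by the text. The mean ignores the ordering, whereas the autocorrelation changes when the same values are rearranged.

```python
import statistics

# Hypothetical daily admission counts with a gradual rise (illustrative only).
admissions = [12, 14, 13, 15, 17, 16, 18, 20, 19, 22, 21, 24]

def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation, which depends on the order of the values."""
    mean = statistics.mean(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

# Order-free summary: adequate for a first look at the data.
overall_mean = statistics.mean(admissions)

# Rearrange the same values by alternating low and high counts.
ordered = sorted(admissions)
half = len(ordered) // 2
rearranged = [v for pair in zip(ordered[:half], reversed(ordered[half:])) for v in pair]

# Order-aware summary: differs between the original and rearranged rows.
r_original = lag1_autocorrelation(admissions)
r_rearranged = lag1_autocorrelation(rearranged)
print(overall_mean == statistics.mean(rearranged), r_original, r_rearranged)
```

In this made-up series, successive days tend to have similar counts, so the original ordering gives a positive autocorrelation, while the alternating rearrangement gives a negative one even though the mean is identical.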
Examples
Both of the data sets below are time-ordered. In the camp site example, the number of tents was recorded on 12 successive days; in the second example, insurance claims were recorded annually from July 2001 to June 2013.
It would be possible to ignore the ordering of the camp site data and perform a preliminary analysis as though they were unordered. (However, this would ignore the fact that some of the days fell at weekends.)
This may also be a reasonable initial analysis of the insurance claim data, but here it is more important to allow for a possible overall increase in shipping over the period.
To include the time-ordering of the data in the data matrix, a new variable can be added, as shown below.
The 'time' variable contains all the information about the ordering of the data, so the data matrix can otherwise be treated as 'unordered'.
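A minimal Python sketch of this idea follows. The tent counts are invented for illustration (the original table is not reproduced here), and the column names `day` and `tents` are assumptions. Once the day number is stored as its own variable, the rows can be rearranged freely and the original sequence can still be recovered.

```python
import random

# Hypothetical tent counts on 12 successive days (illustrative values only).
tents = [5, 7, 12, 14, 6, 5, 4, 6, 8, 13, 15, 7]

# Data matrix: one row per day, with an explicit time variable (day number).
rows = [{"day": day, "tents": count} for day, count in enumerate(tents, start=1)]

# Because 'day' records the ordering, the rows can be rearranged without loss.
scrambled = rows[:]
random.Random(1).shuffle(scrambled)

# The original sequence is recovered by sorting on the time variable.
recovered = sorted(scrambled, key=lambda row: row["day"])
print(recovered == rows)
```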