Information from the variation in data

Variation in data is not simply an annoyance; the variation itself can hold important information. A key role of statistics is to display and describe this variation in ways that highlight the information in it.

Software support time

A software company offers free technical support for one product (an expensive accounting package) via email. To monitor the ongoing costs of this service, the company records the time taken to deal with each emailed query. The table below shows the number of minutes taken to answer each of the 20 queries received in one week. The data are presented (by column) in the order in which the queries arrived, but this ordering is not thought to be important.

Time to process query (min)
61 52 79 23 34
14 53 71 32 28
51 69 61 34 52
55 20 13 49 64

What can you see?

The times clearly vary from query to query, and a quick scan shows that all values are between 0 and 100 minutes. But what else can be easily learned from the table?
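The quick scan can also be checked with a few lines of code. The sketch below is only an illustration: the variable names are arbitrary, and the values are typed in from the table row by row, which is harmless here because the arrival order is not thought to matter.

```python
# Times (min) copied from the table above, read row by row.
times = [
    61, 52, 79, 23, 34,
    14, 53, 71, 32, 28,
    51, 69, 61, 34, 52,
    55, 20, 13, 49, 64,
]

# The smallest and largest values bound everything else in the table.
shortest = min(times)
longest = max(times)

print(f"{len(times)} queries, shortest {shortest} min, longest {longest} min")
```

This confirms the range of the data but, like the raw table, says nothing about how the values are spread within that range.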

Sorting the data can help

It is not easy to obtain further useful information from a table of raw data. Different displays of the data may, however, highlight meaningful patterns. Graphical displays are usually most effective, but even sorting the data into order gives some insight into the values.

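First, look again at the unordered technical support data in the table above. It is difficult to see any unusual features in the raw values.

Sorting the values into increasing order gives the list below.

13 14 20 23 28 32 34 34 49 51 52 52 53 55 61 61 64 69 71 79

The sorted values fall into two groups. Eight queries took between 13 and 34 minutes and twelve took between 49 and 79 minutes, with no values between 34 and 49 minutes.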

Perhaps the two clusters correspond to different types of query that take different times to process, or to queries answered by two different technical support staff. Either way, the pattern is worth further investigation by the company.
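The sort-and-inspect step can be sketched in Python as follows. This is an illustration rather than a formal clustering method: splitting at the single widest gap between neighbouring sorted values is an assumption made here for simplicity, and the variable names are arbitrary.

```python
# Times (min) copied from the table above, read row by row.
times = [
    61, 52, 79, 23, 34,
    14, 53, 71, 32, 28,
    51, 69, 61, 34, 52,
    55, 20, 13, 49, 64,
]

# Sort into increasing order, as described in the text.
ordered = sorted(times)

# Difference between each value and the next one in the sorted list.
gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]

# Assumption: treat the widest gap as the boundary between two clusters.
widest = max(gaps)
split = gaps.index(widest) + 1
low, high = ordered[:split], ordered[split:]

print("Sorted:", ordered)
print(f"Widest gap: {widest} min, between {low[-1]} and {high[0]}")
print(f"Lower cluster ({len(low)} queries):", low)
print(f"Upper cluster ({len(high)} queries):", high)
```

A widest-gap split will always produce two groups, even for data with no real clusters, so the result should be read as a prompt for investigation rather than as evidence in itself.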