Standard deviation and histogram

Students usually find the standard deviation a difficult concept. Luckily, understanding its definition is much less important than knowing its properties and having a feel for what its numerical value means.

If you have understood the 70-95-100 rule, you should be able to make a fairly accurate guess at the standard deviation of a batch of values from a histogram or dot plot (without doing any calculations). About 95% of the values should be within 2 standard deviations of the mean, so after dropping the top 2.5% and bottom 2.5% of the crosses (or area of the histogram), the remainder should span approximately 4 standard deviations. So dividing this range by 4 should approximate the standard deviation.
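The guess described above can also be checked numerically. The following is a minimal sketch rather than a standard routine: the function name guess_sd and the simulated batch are made up for illustration, and the batch is assumed to be roughly bell-shaped. It drops the top and bottom 2.5% of the values, divides the remaining range by 4, and prints the result beside the calculated standard deviation.

```python
import random
import statistics


def guess_sd(values):
    """Approximate the SD as (97.5th percentile - 2.5th percentile) / 4."""
    # quantiles(n=40) returns 39 cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(values, n=40)
    lower, upper = cuts[0], cuts[-1]
    return (upper - lower) / 4


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    # Made-up, roughly bell-shaped batch of 500 values (illustration only).
    batch = [rng.gauss(50, 8) for _ in range(500)]
    print("guessed SD:   ", round(guess_sd(batch), 2))
    print("calculated SD:", round(statistics.stdev(batch), 2))
```

For a skew batch or one with outliers the two numbers can differ more, since the rule of thumb is only approximate.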

Similarly, given the mean and standard deviation for a data set, you should be able to draw a rough sketch of a symmetric histogram with that mean and standard deviation. (It would be centred on the mean and 95% of the area would be within 2 standard deviations of this.)

Retail Store Merchandise Buyers

The table below was published in the Journal of Retailing. It describes the results from questionnaires completed by 'merchandise buyers' in 213 chain stores — the people responsible for purchasing stock that will be sold by the retailers. The buyers were classified into two groups, depending on whether they were still in the same employment after 9 months, and a major aim of the study was to compare these 'stayers' and 'leavers'.

                          Stayers (N = 177)     Leavers (N = 36)
Construct                 Mean      SD          Mean      SD
Intention to leave         8.77     3.74        11.25     4.41
Job satisfaction          17.59     2.45        16.31     3.64
Role conflict             10.68     2.73        11.71     2.50
Role ambiguity             7.21     2.09         8.06     2.45
Intrinsic orientation     17.14     3.08        17.55     2.61
DMM influence              2.79     1.19         2.78     1.46
Income                     3.27     1.39         2.87     1.27
Buycenter influence        7.92     2.68         8.69     2.88
BS activities              9.92     2.21         9.83     2.38
BS relationships          22.10     3.27        22.53     2.81

The table summarises the main differences between the groups. (We refer to the paper for details of the measurements — most were obtained by aggregating the responses to a group of questions in the questionnaire.)

An Empirical Investigation of Dysfunctional Turnover Among Chain and Non-Chain Retail Store Buyers, S. M. Keaveney, Journal of Retailing, 68, 1992, pp. 145-170.

Guessed histograms

Using the 70-95-100 rule of thumb, we can sketch a rough histogram to match each mean and standard deviation: about 70% of each histogram's area should be within s of the mean, about 95% within 2s of the mean and nearly all within 3s. These can be used to compare the 'Stayers' and 'Leavers' for any variable.
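One way to produce such a rough histogram is sketched below. It is an illustration only: the helper names class_areas and sketch are hypothetical, and a normal-shaped curve is assumed because the table gives nothing beyond each mean and standard deviation. The six classes are each one standard deviation wide; the middle two hold roughly 70% of the area and the middle four roughly 95%.

```python
from statistics import NormalDist


def class_areas(mean, sd):
    """Split mean +/- 3 SD into six classes, each one SD wide.

    Returns (lower, upper, proportion) for each class, assuming a
    normal-shaped batch; that shape is a guess, not a property of the data.
    """
    curve = NormalDist(mean, sd)
    classes = []
    for k in range(-3, 3):
        lower, upper = mean + k * sd, mean + (k + 1) * sd
        classes.append((lower, upper, curve.cdf(upper) - curve.cdf(lower)))
    return classes


def sketch(mean, sd, width=50):
    """Draw each class as a row of '#' whose length is proportional to its area."""
    return "\n".join(
        f"{lower:6.2f} to {upper:6.2f} | {'#' * round(area * width)}"
        for lower, upper, area in class_areas(mean, sd)
    )


if __name__ == "__main__":
    # 'Intention to leave' means and SDs, taken from the table.
    for group, mean, sd in [("Stayers", 8.77, 3.74), ("Leavers", 11.25, 4.41)]:
        print(group)
        print(sketch(mean, sd))
```

Printing the two sketches one under the other gives a quick visual comparison of the groups for that variable.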

The actual histograms may not be symmetric and would certainly be more 'boxy' than those above, but this is the best we can do from the available information.

Interpretation...

From the means and standard deviations, we can either use the approximate histograms (as sketched above) or use the 70-95-100 rule directly to infer that there is considerable overlap between the distributions for Stayers and Leavers for all variables.
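The overlap can be illustrated by applying the rule directly to the table. The sketch below is not part of the original study: the names TABLE, two_sd_range and overlap are made up for illustration. For each variable it works out the interval within 2 standard deviations of the mean for each group (which should hold about 95% of that group's values) and reports how much of the two intervals is shared.

```python
# (mean, SD) pairs copied from the table: Stayers first, then Leavers.
TABLE = {
    "Intention to leave": ((8.77, 3.74), (11.25, 4.41)),
    "Job satisfaction": ((17.59, 2.45), (16.31, 3.64)),
    "Role conflict": ((10.68, 2.73), (11.71, 2.50)),
    "Role ambiguity": ((7.21, 2.09), (8.06, 2.45)),
    "Intrinsic orientation": ((17.14, 3.08), (17.55, 2.61)),
    "DMM influence": ((2.79, 1.19), (2.78, 1.46)),
    "Income": ((3.27, 1.39), (2.87, 1.27)),
    "Buycenter influence": ((7.92, 2.68), (8.69, 2.88)),
    "BS activities": ((9.92, 2.21), (9.83, 2.38)),
    "BS relationships": ((22.10, 3.27), (22.53, 2.81)),
}


def two_sd_range(mean, sd):
    """Interval expected to hold about 95% of the values."""
    return mean - 2 * sd, mean + 2 * sd


def overlap(range_a, range_b):
    """Length of the interval common to both ranges (0 if they are disjoint)."""
    return max(0.0, min(range_a[1], range_b[1]) - max(range_a[0], range_b[0]))


if __name__ == "__main__":
    for name, (stayers, leavers) in TABLE.items():
        s_range = two_sd_range(*stayers)
        l_range = two_sd_range(*leavers)
        print(f"{name:22s} Stayers {s_range[0]:6.2f} to {s_range[1]:6.2f}   "
              f"Leavers {l_range[0]:6.2f} to {l_range[1]:6.2f}   "
              f"shared length {overlap(s_range, l_range):5.2f}")
```

For 'Intention to leave', for instance, the intervals are about 1.29 to 16.25 for Stayers and 2.43 to 20.07 for Leavers, so most of each interval is shared even though the means differ by about 2.5.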

In later chapters of CAST, you will meet statistical methods that allow you to properly compare two groups such as these, but we mention here that these methods will be based on the group means and standard deviations.