The shape of a distribution

Many different distributions have the same mean and standard deviation.

The mean and standard deviation hold no information about the shape of a distribution, other than its centre and spread.

In particular, the mean and standard deviation give no indication of whether a data set contains an outlier, separates into clusters, or is skew.

These are important features of a data set and should influence the analysis that you perform and the conclusions that you reach. In particular, if you ignore outliers or clusters, you could easily reach the wrong conclusions.

It is therefore essential that you look at the distribution with a dot plot, histogram or box plot before 'condensing' the data into a mean and standard deviation for further analysis.


Distributions with the same mean and standard deviation

The following four data sets all contain the same number of values, n = 100, and have the same mean, x̄ = 248.5, and standard deviation, s = 91.1, but should be analysed in different ways.
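One way to see how differently shaped data sets can share a mean and standard deviation is to rescale them linearly. The sketch below is only an illustration: the two samples are randomly generated stand-ins, not the data sets described here, and the helper name rescale is hypothetical. It assumes the sample standard deviation (divisor n - 1).

```python
import random
import statistics

def rescale(values, target_mean, target_sd):
    """Linearly rescale values so the sample mean and SD match the targets."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD, divisor n - 1
    return [target_mean + target_sd * (v - m) / s for v in values]

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Two invented raw samples with very different shapes.
bell = [rng.gauss(0, 1) for _ in range(100)]        # roughly symmetric
skew = [rng.expovariate(1.0) for _ in range(100)]   # long tail to the right

bell_scaled = rescale(bell, 248.5, 91.1)
skew_scaled = rescale(skew, 248.5, 91.1)

for name, data in [("bell", bell_scaled), ("skew", skew_scaled)]:
    print(name,
          "mean =", round(statistics.mean(data), 1),
          "sd =", round(statistics.stdev(data), 1),
          "min =", round(min(data), 1),
          "max =", round(max(data), 1))
```

Both rescaled samples report the same mean and standard deviation, yet their minimum and maximum values differ, which is the kind of difference only a plot of the distribution reveals.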

Symmetric bell-shaped distribution

The data set above has a distribution whose shape is what would be imagined from the mean and standard deviation. Its shape is well described by these two summary statistics.
Outlier

This data set contains an outlier. It is probably a measurement or recording error, or else the 'individual' differs in some other way from the rest of the data and should not be analysed with them.
After deleting the outlier, the mean reduces from 248.5 to 241.4 and the standard deviation drops from 91.1 to 57.6. The measurements are therefore much less variable than the raw standard deviation suggests.
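The effect of a single outlier on both summary statistics can be checked with a few lines of code. This is a minimal sketch using ten made-up measurements, not the data set described here, and it simply assumes that the largest value is the suspect one.

```python
import statistics

# Hypothetical measurements, invented for illustration.
values = [210, 225, 230, 238, 241, 247, 252, 260, 268, 900]

# Assumption: the largest value is the suspect outlier.
outlier = max(values)
trimmed = [v for v in values if v != outlier]

mean_all, sd_all = statistics.mean(values), statistics.stdev(values)
mean_trim, sd_trim = statistics.mean(trimmed), statistics.stdev(trimmed)

print(f"with outlier:    mean = {mean_all:.1f}, sd = {sd_all:.1f}")
print(f"without outlier: mean = {mean_trim:.1f}, sd = {sd_trim:.1f}")
```

The standard deviation usually changes far more than the mean, because the outlier's large squared deviation dominates the sum of squares.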
Clusters

In this data set, the values separate into two distinct clusters. The researcher should investigate what is different about the 'individuals' in the two clusters. For example, annual rainfalls may have been recorded in two types of years (e.g. La Nina and El Nino), or two different varieties of maize may have been grown in a survey of crop yields.
The two clusters have different means and the standard deviation within each cluster is much smaller than 91.1, so again, the overall mean and standard deviation do not adequately describe the data.
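The comparison between the overall standard deviation and the standard deviation within each cluster can be sketched as follows. The two clusters are simulated with invented centres and spreads, so the numbers printed are not those of the data set described here.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the sketch is reproducible

# Hypothetical clusters (for example, two crop varieties); all parameters are invented.
cluster_a = [rng.gauss(165, 25) for _ in range(50)]
cluster_b = [rng.gauss(330, 25) for _ in range(50)]
combined = cluster_a + cluster_b

def summary(values):
    """Return the sample mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)

for label, data in [("combined", combined),
                    ("cluster A", cluster_a),
                    ("cluster B", cluster_b)]:
    m, s = summary(data)
    print(f"{label:10s} mean = {m:6.1f}, sd = {s:5.1f}")
```

Most of the combined standard deviation comes from the gap between the two cluster means rather than from variability within either cluster.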
Skew distribution

This data set is skew, with a long tail towards the high values. The 70-95-100 rule suggests that about 15% of values are below x̄ - s and 15% are above x̄ + s (with 70% between these values). However, this distribution has no values lower than x̄ - s, while 14% are above x̄ + s, 6% are above x̄ + 2s and 2% are above x̄ + 3s.
The 70-95-100 rule does not give a good impression of this distribution; the percentages are only approximately correct for fairly symmetric, bell-shaped distributions.
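The percentages quoted for a data set can be compared with the 70-95-100 rule by counting values directly. The sketch below assumes the sample standard deviation and uses a simulated right-skew sample, not the data set described here; the function name proportions_outside is hypothetical.

```python
import random
import statistics

def proportions_outside(values):
    """Proportion of values below mean - s and above mean + s, + 2s, + 3s."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD, divisor n - 1
    n = len(values)
    return {
        "below mean - s": sum(v < m - s for v in values) / n,
        "above mean + s": sum(v > m + s for v in values) / n,
        "above mean + 2s": sum(v > m + 2 * s for v in values) / n,
        "above mean + 3s": sum(v > m + 3 * s for v in values) / n,
    }

rng = random.Random(3)  # fixed seed so the sketch is reproducible

# Invented right-skew sample with a long tail towards high values.
skew_sample = [rng.expovariate(1.0) for _ in range(100)]

for label, p in proportions_outside(skew_sample).items():
    print(f"{label}: {p:.0%}")
```

For a fairly symmetric, bell-shaped sample the first two proportions should each be near 15%; a marked imbalance between them is a sign of skewness.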

In the presence of an outlier, clusters or skewness, the mean and standard deviation fail to capture an important aspect of the distribution's shape. They are particularly misleading in the presence of outliers or clusters.

The diagram below shows the four distributions together as histograms to make comparison easier.

A histogram or dot plot is needed to describe the clustered distribution, but a box plot would capture the main features of the skew distribution and of the distribution with an outlier.