Histograms of data in each group

When data are collected from two groups, a histogram can be used to graphically display the distribution of values in each group.

Satellite classification of land use

Natural resource managers often use satellite data to classify land use. The diagram below shows near-infrared intensities that were recorded by the Landsat Multispectral Scanner for 118 areas that were known to contain forest, and for another 40 areas that were known to be urban.

This diagram is 3-dimensional. Position the mouse in the middle of the diagram and drag towards the top left of the screen to rotate the plot (or click the 3D rotation button). The histogram within each group describes the distribution of near-infrared intensities of areas of the two types.
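
For readers without access to the interactive diagram, the idea of a separate histogram within each group can be sketched in a few lines of Python. This is only an illustration: the intensities below are randomly generated placeholder values, not the Landsat measurements, and the means, standard deviations, bin width and the helper name text_histogram are all assumptions made for the example. Only the group sizes (118 and 40) come from the text.

```python
import random
from collections import Counter

def text_histogram(values, bin_width, low):
    """Return {bin_start: count} for equal-width bins that start at `low`."""
    counts = Counter(low + bin_width * int((v - low) // bin_width) for v in values)
    return dict(sorted(counts.items()))

# Placeholder data only: NOT the Landsat measurements. The means and standard
# deviations are invented; only the group sizes (118, 40) come from the text.
rng = random.Random(0)
groups = {
    "forest": [rng.gauss(80, 8) for _ in range(118)],
    "urban": [rng.gauss(95, 10) for _ in range(40)],
}

# Use the same bins for both groups so the two histograms are comparable.
low, bin_width = 40, 10
histograms = {name: text_histogram(vals, bin_width, low) for name, vals in groups.items()}

for name, hist in histograms.items():
    print(f"{name} (n = {len(groups[name])})")
    for start, count in hist.items():
        print(f"  {start:>3}-{start + bin_width:<3} {'*' * count}")
```

Using the same bin boundaries in both groups makes it easier to compare the centre and spread of the two distributions.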

Model for each group

A single batch of numerical values is usually modelled as a random sample from some population — often a normal distribution. In a similar way, data sets that consist of measurements from two groups are often modelled as two independent random samples from two underlying hypothetical infinite populations. Normal distributions are again commonly used as models.

(The assumption of normality should be checked using graphical displays of the sample data. If the data are noticeably skewed, a transformation may provide values that can be adequately modelled by normal distributions.)
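
As a rough numerical companion to a graphical check, the effect of a transformation on skewness can be sketched as follows. The data are simulated from a lognormal distribution purely for illustration (they are not from the satellite example), and the skewness function and the choice of a log transformation are assumptions of this sketch rather than a prescribed method.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed standardised deviation (0 for symmetric data)."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Simulated right-skewed data (lognormal), used only to illustrate the idea.
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 0.6) for _ in range(500)]

# A log transformation often makes right-skewed positive data more symmetric.
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before transformation: {skew_raw:.2f}")
print(f"skewness after log transformation: {skew_log:.2f}")
```

A skewness close to zero after the transformation suggests that a normal model is more plausible for the transformed values, although a histogram should still be inspected.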

Satellite classification of land use

The histograms of near-infrared intensities that were recorded from forested and urban areas both seemed fairly symmetrical, so normal distributions are reasonable models within the two groups. The diagram below shows a possible model for the data.

Click Take sample to select a random sample from each of the two normal distributions. The model claims that the real data set consists of random samples from distributions like these.
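
The sampling that the button performs can be imitated with a short simulation. This is a minimal sketch under stated assumptions: the means and standard deviations in MODEL are hypothetical values chosen for illustration, not estimates from the Landsat data, and take_sample is a name invented here. Only the sample sizes (118 forest areas and 40 urban areas) come from the text.

```python
import random
import statistics

# Hypothetical model parameters, chosen only for illustration; they are not
# estimates from the Landsat data. Only the sample sizes come from the text.
MODEL = {
    "forest": {"mean": 80.0, "sd": 8.0, "n": 118},
    "urban": {"mean": 95.0, "sd": 10.0, "n": 40},
}

def take_sample(model, rng):
    """Draw one independent random sample from each group's normal distribution."""
    return {
        name: [rng.gauss(p["mean"], p["sd"]) for _ in range(p["n"])]
        for name, p in model.items()
    }

rng = random.Random(1)
for draw in range(3):
    sample = take_sample(MODEL, rng)
    summary = ", ".join(
        f"{name}: mean {statistics.mean(vals):.1f}, sd {statistics.stdev(vals):.1f}"
        for name, vals in sample.items()
    )
    print(f"sample {draw + 1}: {summary}")
```

Each repetition gives slightly different sample means and standard deviations, which illustrates the sample-to-sample variability that the model implies for the real data.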