Simulation and randomisation
Simulation and randomisation are closely related techniques. Both are based on assumptions about the model underlying the data and involve randomly generated data sets.
Randomisation is understood most easily through an example.
Comparing two groups
If random samples are taken from two populations, we are often interested in whether the populations have the same means.
If the two populations were identical, any allocation of the sample values to the two groups would have been as likely as the observed sample data. By observing the distribution of the difference in means from such randomised allocations of values to groups, we can get an idea of whether the actual difference in sample means is unusually large.
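A single randomised allocation of this kind can be sketched in a few lines of Python. This is only an illustrative sketch: the function name randomised_difference and the small data lists are hypothetical, made up for the example, and are not taken from any study.

```python
import random


def randomised_difference(group_a, group_b, rng=random):
    """Pool both samples, reallocate the values at random to two groups
    of the original sizes, and return mean(new_b) - mean(new_a)."""
    pooled = list(group_a) + list(group_b)
    rng.shuffle(pooled)                      # random reallocation of values
    new_a = pooled[:len(group_a)]            # same size as the original group A
    new_b = pooled[len(group_a):]            # remaining values form group B
    return sum(new_b) / len(new_b) - sum(new_a) / len(new_a)


# Made-up illustrative values (hypothetical, not real measurements).
group_a = [4.1, 5.0, 3.8, 4.6]
group_b = [5.2, 4.9, 5.7, 4.4, 5.1]

rng = random.Random(0)                       # seeded only for repeatability
diffs = [randomised_difference(group_a, group_b, rng) for _ in range(5)]
print(diffs)
```

Each call gives one difference in means that could have arisen if the group labels carried no information. Repeating the call many times builds up the distribution against which the observed difference is judged.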
An example helps to explain this method.
Lengths of coyotes
In a study of coyotes, 40 females and 43 males were captured in Nova Scotia. The diagram below shows the lengths (cm) of these coyotes. The mean length of the males was 2.8 cm greater than the mean length of the females, but the distributions overlap considerably. Might this difference be simply a result of randomness, or can we conclude that there is a difference in the underlying populations?
Click Randomise to randomly pick 40 of the 83 values for the female group; the remaining 43 values form the male group. If the underlying distribution of lengths was the same for males and females, each such randomised allocation would be as likely as the observed data.
Click Accumulate and repeat the randomisation several more times. Observe that the difference in means is at least as far from zero as 2.8 cm in about 6 percent of these randomisations, which assume the same distribution for both groups. A difference in sample means of 2.8 cm is therefore not particularly unusual if the male and female distributions are the same.
Since the actual difference is not unusually large, we can conclude that there is very little evidence from the data that the mean length of female coyotes is different from the mean length of the males.
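The accumulation step used in the coyote example can be sketched as follows. This is a minimal sketch under stated assumptions: the function name randomisation_proportion is hypothetical, and the lengths below are simulated placeholders with an assumed mean and spread, not the measurements from the study. Only the group sizes (40 and 43) come from the text.

```python
import random


def randomisation_proportion(females, males, n_randomisations=2000, seed=None):
    """Return the observed difference in means (males - females) and the
    proportion of randomised allocations whose difference in means is at
    least as far from zero as the observed one."""
    rng = random.Random(seed)
    n_f = len(females)
    observed = sum(males) / len(males) - sum(females) / n_f
    pooled = list(females) + list(males)
    tolerance = 1e-12                        # guards against float round-off
    count = 0
    for _ in range(n_randomisations):
        rng.shuffle(pooled)                  # random allocation to the two groups
        diff = sum(pooled[n_f:]) / len(males) - sum(pooled[:n_f]) / n_f
        if abs(diff) >= abs(observed) - tolerance:
            count += 1
    return observed, count / n_randomisations


# Simulated placeholder lengths (assumed values, NOT the study data).
placeholder_rng = random.Random(1)
females = [placeholder_rng.gauss(90.0, 6.0) for _ in range(40)]
males = [placeholder_rng.gauss(92.0, 6.0) for _ in range(43)]

observed, proportion = randomisation_proportion(females, males, seed=2)
print(f"observed difference: {observed:.2f} cm")
print(f"proportion at least as extreme: {proportion:.3f}")
```

A small proportion would suggest that the observed difference is unusual when both groups share one distribution. A proportion like the roughly 6 percent seen in the coyote randomisations does not.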