Inference and random samples
The examples in the previous section involved several different types of model for the observed data. In the remainder of this chapter, we concentrate on one particular type of model: we assume that the observed data are a random sample from some population.
When the observed data are a random sample, inference asks questions about characteristics of the underlying population distribution, namely its unknown population parameters.
For random samples, the null and alternative hypotheses specify values for the unknown population parameters.
Inference about categorical populations
When the population distribution is categorical, the unknowns are the population probabilities for the different categories. To simplify, we consider populations for which one category is of particular interest ('success') and we denote the unknown probability of success by π.
The null and alternative hypotheses are therefore specified in terms of π.
Weapon detection at LAX
FAA agents tried to carry 100 weapons onto planes at LA International Airport. Of these, 72 were detected by security guards, and we are interested in whether this is consistent with the national probability of detection, 0.80.
We model detection of weapons as a random sample of 100 categorical values from a population with probability π of success (detection). The null hypothesis of interest is therefore...
H0: π = 0.80
The alternative hypothesis is that the probability of detection at LAX is lower than the national value:
HA: π < 0.80
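As a rough illustration of what these hypotheses imply, the sketch below computes how likely it would be to detect 72 or fewer of the 100 weapons if H0 were true. It assumes that the number detected has a binomial distribution with n = 100 and π = 0.80, and that a small lower-tail probability counts as evidence for HA. The names binomial_pmf, pi_0 and p_value are illustrative only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 100        # weapons carried by FAA agents
detected = 72  # weapons detected by security guards
pi_0 = 0.80    # probability of detection under H0

# Lower tail: P(X <= 72) when X ~ Binomial(100, 0.80).
# The lower tail is used because HA is pi < 0.80.
p_value = sum(binomial_pmf(k, n, pi_0) for k in range(detected + 1))

print(f"P(X <= {detected} | pi = {pi_0}) = {p_value:.4f}")
```

A small value means that a detection count as low as 72 would be unusual if the LAX detection probability really were 0.80.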
Telepathy experiment
An experiment is conducted to investigate whether one subject can telepathically pass shape information to another subject. A deck of cards containing equal numbers of cards with circles, squares and crosses is shuffled. One subject selects cards at random and attempts to 'send' the shape on the card to the other subject, who is seated behind a screen; this second subject reports the shape imagined for the card. Out of 90 cards, the second subject correctly identifies 36.
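This situation can be modelled as random sampling of 90 values (correct or wrong) from a categorical population in which the probability of correctly identifying the card is π. If the second subject is only guessing, each of the three equally common shapes is equally likely to be reported correctly, so π would be 1/3. The null hypothesis of interest is therefore...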
H0: π = 1/3 (guessing)
The alternative hypothesis is
HA: π > 1/3 (telepathy)
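The same kind of calculation can be sketched for the telepathy data. Here the alternative is π > 1/3, so the relevant probability is the upper tail: the chance of 36 or more correct identifications out of 90 if the subject is only guessing. As before, this assumes a binomial model for the number correct, and the names used are illustrative only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 90        # cards 'sent'
correct = 36  # cards correctly identified
pi_0 = 1 / 3  # probability of a correct answer under H0 (guessing)

# Upper tail: P(X >= 36) when X ~ Binomial(90, 1/3).
# The upper tail is used because HA is pi > 1/3.
p_value = sum(binomial_pmf(k, n, pi_0) for k in range(correct, n + 1))

print(f"P(X >= {correct} | pi = 1/3) = {p_value:.4f}")
```

The only differences from a lower-tail calculation are the direction of the tail and the value of π specified by H0; both follow directly from how the hypotheses are written.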
Tests about parameters of other populations
Other data sets arise as random samples from different kinds of population. For example, numerical data sets are often modelled as random samples from a normal distribution. Again, the hypotheses of interest are usually expressed in terms of the parameters of this distribution.
For example, to test whether the mean of a normal distribution is zero, the hypotheses would be...
H0: µ = 0
HA: µ ≠ 0
In the remainder of this section, we show how to test a population probability, and in the next section we will describe tests about a population mean.