'Missing' responses
The first two types of non-sampling error are caused by failure to obtain information from some members of the target population.
Coverage error
Coverage error occurs when the sample is selected from only part of the target population rather than from the whole of it. As a result, the estimates that are obtained describe only a subgroup of the target population, not the whole.
A researcher is interested in irrigation practice among wheat growers in a region. There is no database containing names and addresses of all farmers growing wheat, so questionnaires are sent to members of a local wheat-growers association. Depending on the number and characteristics of farmers who grow wheat but are not members of the association, there is potential for considerable coverage error.
A magazine aimed at teenagers conducts a poll by asking readers to mail back a questionnaire printed in its January issue. The results are published as representing teenagers' attitudes to certain issues.
This survey only covers teenagers who read the magazine, not all teenagers. There is potential for considerable coverage error if the magazine-readers are not 'typical' of all teenagers.
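The effect of sampling from only part of the target population can be illustrated with a small simulation. This is a minimal sketch, not a model of any real survey: the population size, the 20% readership, and the 70% and 40% agreement rates are all invented for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 teenagers, about 20% of whom read the magazine.
# Assumed (invented) attitudes: 70% of readers agree with some statement,
# but only 40% of non-readers do.
population = []
for _ in range(100_000):
    reader = random.random() < 0.20
    agrees = random.random() < (0.70 if reader else 0.40)
    population.append((reader, agrees))

def proportion_agreeing(people):
    """Fraction of (reader, agrees) records in which agrees is True."""
    return sum(agrees for _, agrees in people) / len(people)

true_value = proportion_agreeing(population)

# The magazine poll can only reach readers, so it samples from a subgroup.
readers = [person for person in population if person[0]]
covered_estimate = proportion_agreeing(random.sample(readers, 2_000))

# For comparison: a sample of the same size drawn from all teenagers.
full_estimate = proportion_agreeing(random.sample(population, 2_000))

print(f"whole population:        {true_value:.3f}")
print(f"sample of readers only:  {covered_estimate:.3f}")
print(f"sample of all teenagers: {full_estimate:.3f}")
```

Under these assumptions the readers-only estimate sits near 0.70 while the population value is near 0.46. Taking a larger sample of readers would not close that gap, because the error comes from who can be sampled, not from how many.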
Non-response error
In many surveys, some selected individuals do not respond. This may be caused by ...
If non-response is related to the questions being asked, estimates from the survey are likely to be biased.
A survey is conducted to assess the number and types of books that are read in a city. Phone numbers are randomly selected from a telephone directory and are called on weekday evenings.
People who are not at home (and therefore do not respond) are likely to read less than those who do respond, so the sample responding will tend to overestimate book readership. Estimates of book readership would therefore be biased.
There are several other flaws in this survey that introduce further non-sampling errors. In particular, there is also coverage error since residents whose numbers are not listed in the telephone directory cannot be sampled.
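The non-response bias in the book survey can be sketched in the same way. The numbers below are assumptions made up for illustration: each resident reads between 0 and 10 books a month, and the chance of being at home on a weekday evening rises from 0.2 for non-readers to 0.8 for the heaviest readers.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical city of 50,000 residents (all figures invented).
# Assumption: the more someone reads, the more likely they are to be
# at home on a weekday evening and so to answer the phone.
residents = []
for _ in range(50_000):
    books = random.randint(0, 10)      # books read per month
    p_home = 0.2 + 0.06 * books        # 0.2 for non-readers, 0.8 for heavy readers
    residents.append((books, p_home))

true_mean = sum(books for books, _ in residents) / len(residents)

# A genuinely random selection of residents to call...
selected = random.sample(residents, 5_000)

# ...but only those who are at home respond.
respondents = [books for books, p_home in selected if random.random() < p_home]

response_rate = len(respondents) / len(selected)
respondent_mean = sum(respondents) / len(respondents)

print(f"population mean:  {true_mean:.2f} books per month")
print(f"response rate:    {response_rate:.0%}")
print(f"respondent mean:  {respondent_mean:.2f} books per month")
```

With these assumed numbers the population reads about 5 books a month on average, but the respondents average roughly 6. The selection itself was random; the bias arises entirely because responding is related to the quantity being measured.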
Real example
In the 1936 American presidential election, there were two main candidates, Roosevelt and Landon. The Literary Digest conducted a poll, aiming to predict the result of the election; its procedure was to mail questionnaires to 10 million Americans (using names from telephone books and club membership lists). From the 2.4 million replies, it made the following prediction:
| | Landon (%) | Roosevelt (%) |
|---|---|---|
| Literary Digest's prediction | 57 | 43 |
| Actual result | 38 | 62 |
Despite the large sample size (and resulting small sampling error), the non-sampling errors were extremely large in the poll.
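Both types of error described above were at work. The mailing list was drawn from telephone books and club memberships, which in 1936 over-represented wealthier Americans (coverage error), and fewer than a quarter of the people who were sent a questionnaire returned it (non-response error). The group who responded therefore had different characteristics from the whole population, hence the large difference between the Literary Digest prediction and the actual election result.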
Incidentally, another pollster, George Gallup, also conducted a survey before this election. Although he only sampled 50,000 people, he put more effort into making his sample representative. His poll predicted that Roosevelt would win the election with 56 percent of the vote, much closer to the actual result.
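To see how small the sampling error of each poll would have been, the sketch below uses the standard approximation for a 95% margin of error of a proportion. It assumes a simple random sample and the worst case p = 0.5; neither poll was actually a simple random sample, so this shows only the size of the sampling error, not how either poll was analysed. The function name `margin_of_error` is made up for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion,
    assuming a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

digest_moe = margin_of_error(2_400_000)  # Literary Digest: 2.4 million replies
gallup_moe = margin_of_error(50_000)     # Gallup: 50,000 people sampled

# Actual errors in percentage points, from the figures quoted in the text.
digest_error = 57 - 38   # Digest's prediction for Landon minus his actual share
gallup_error = 62 - 56   # Roosevelt's actual share minus Gallup's prediction

print(f"Digest: sampling margin +/-{100 * digest_moe:.2f} points, "
      f"actual error {digest_error} points")
print(f"Gallup: sampling margin +/-{100 * gallup_moe:.2f} points, "
      f"actual error {gallup_error} points")
```

The sampling margin comes out at about 0.06 percentage points for the Literary Digest and about 0.44 for Gallup. The Digest's 19-point miss is therefore hundreds of times larger than sampling error alone could explain, and even Gallup's 6-point miss is mostly non-sampling error.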