Data that are not sampled from a finite population

Sometimes data are actually sampled from a real finite population. For example, a public opinion poll may select individuals from the population of all residents in a city. The previous section showed that:

Random sampling of values from a finite population can explain the sample-to-sample variability of some data.

However, most data sets have no real finite population underlying them from which the values can be treated as sampled. The randomness in such data must be explained in a different way.

Silkworm poisoning

In an investigation of the speed of the toxic action of arsenic on silkworm larvae, 80 fourth-instar silkworm larvae weighing between 0.41 and 0.45 grams were given 0.10 mg of sodium arsenate per gram of body weight. Their survival times in seconds are given below.

Survival of poisoned silkworm larvae (seconds)
270  254  293  244  293  261  285  330  284  274
307  235  215  292  309  267  275  298  241  254
256  275  226  287  280  339  294  298  283  366
300  310  280  240  291  286  230  285  218  279
280  286  345  289  210  282  260  228  243  259
285  275  280  296  283  248  314  258  215  299
240  241  236  255  267  271  253  271  233  260
273  233  271  267  258  319  310  302  260  251

There is no real finite population from which the survival times can be considered to be sampled. However, there is variability within this data set, and repeating the experiment would have resulted in a different set of survival times.
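The variability within the data set can be summarised numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the choice of mean, standard deviation and range as summaries is an assumption rather than something prescribed by the text.

```python
import statistics

# Survival times (seconds) of the 80 poisoned silkworm larvae, copied from the table above.
survival_times = [
    270, 254, 293, 244, 293, 261, 285, 330, 284, 274,
    307, 235, 215, 292, 309, 267, 275, 298, 241, 254,
    256, 275, 226, 287, 280, 339, 294, 298, 283, 366,
    300, 310, 280, 240, 291, 286, 230, 285, 218, 279,
    280, 286, 345, 289, 210, 282, 260, 228, 243, 259,
    285, 275, 280, 296, 283, 248, 314, 258, 215, 299,
    240, 241, 236, 255, 267, 271, 253, 271, 233, 260,
    273, 233, 271, 267, 258, 319, 310, 302, 260, 251,
]

n = len(survival_times)
mean_time = statistics.mean(survival_times)
sd_time = statistics.stdev(survival_times)  # sample standard deviation (divisor n - 1)
shortest, longest = min(survival_times), max(survival_times)

print(f"n = {n}")
print(f"mean = {mean_time:.1f} s, standard deviation = {sd_time:.1f} s")
print(f"range = {shortest} s to {longest} s")
```

A non-zero standard deviation and a wide range confirm that the larvae did not all survive for the same time, even though they were treated alike.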

Sampling from an abstract population

Random sampling from a population is such an intuitive way to explain sample-to-sample variability that we also use it to explain variability even when there is no real population from which the data were sampled.

We replace the real population that usually underlies survey data with an abstract population of all values that might have been obtained if the data collection had been repeated. We can then treat the observed data as a random sample from this abstract population.

The variation in the underlying abstract population gives us information about the variation in similar data in general.

Defining such an underlying population therefore not only explains sample-to-sample variability but also gives us a focus for generalising from our specific data.
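The idea of repeating the data collection can be imitated by simulation. The sketch below is purely illustrative: it assumes that the abstract population is a normal distribution with a hypothetical mean of 270 seconds and standard deviation of 30 seconds. Neither the normal shape nor these values comes from the text, and the helper `repeat_experiment` is a made-up name.

```python
import random
import statistics

# ASSUMPTIONS (illustrative only, not taken from the source):
# the abstract population is modelled as normal with these parameters.
ASSUMED_MEAN = 270.0   # hypothetical population mean (seconds)
ASSUMED_SD = 30.0      # hypothetical population standard deviation (seconds)
SAMPLE_SIZE = 80       # same number of values as one experiment


def repeat_experiment(rng, n=SAMPLE_SIZE):
    """Draw n hypothetical survival times from the assumed abstract population."""
    return [rng.gauss(ASSUMED_MEAN, ASSUMED_SD) for _ in range(n)]


rng = random.Random(1)  # fixed seed so that the sketch is reproducible

# Five imagined repetitions of the experiment, each summarised by its mean.
sample_means = [statistics.mean(repeat_experiment(rng)) for _ in range(5)]

for i, m in enumerate(sample_means, start=1):
    print(f"repeat {i}: mean survival time = {m:.1f} s")
```

Each imagined repetition gives a different sample and therefore a different sample mean, which is the sample-to-sample variability that the abstract population is intended to explain.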

Silkworm poisoning

It is convenient to model the variability in these data by treating them as a sample from the infinite population of all possible measurements that could have been made on similar silkworm larvae. The variability in this hypothetical population reflects the variability in the survival times.

The distribution of survival times in the sample of poisoned silkworms provides information about the distribution of this underlying population, that is, the distribution of survival times of silkworms in general when given 0.10 mg of sodium arsenate per gram of body weight.
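One simple way to look at the sample distribution is a table of the proportion of larvae in each class of survival times. The sketch below assumes 20-second classes, which is an arbitrary choice made here for illustration; the sample proportions are only estimates of the corresponding proportions in the abstract population.

```python
from collections import Counter

# Survival times (seconds) of the 80 poisoned silkworm larvae, from the data table.
survival_times = [
    270, 254, 293, 244, 293, 261, 285, 330, 284, 274,
    307, 235, 215, 292, 309, 267, 275, 298, 241, 254,
    256, 275, 226, 287, 280, 339, 294, 298, 283, 366,
    300, 310, 280, 240, 291, 286, 230, 285, 218, 279,
    280, 286, 345, 289, 210, 282, 260, 228, 243, 259,
    285, 275, 280, 296, 283, 248, 314, 258, 215, 299,
    240, 241, 236, 255, 267, 271, 253, 271, 233, 260,
    273, 233, 271, 267, 258, 319, 310, 302, 260, 251,
]

CLASS_WIDTH = 20  # assumed class width in seconds (an illustrative choice)

# Count the values in each class, keyed by the lower bound of the class.
class_counts = Counter((t // CLASS_WIDTH) * CLASS_WIDTH for t in survival_times)

n = len(survival_times)
# Sample proportion in each class, used as an estimate of the population proportion.
proportions = {lower: class_counts[lower] / n for lower in sorted(class_counts)}

for lower, p in proportions.items():
    upper = lower + CLASS_WIDTH - 1
    print(f"{lower}-{upper} s: {p:.3f} {'*' * class_counts[lower]}")
```

The rows of asterisks form a rough histogram of the sample, whose shape suggests the shape of the distribution of survival times in the underlying population.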