Probability and population proportion
When sampling from a finite population, the probability of any 'event' is the proportion of population values for which that 'event' happens. For example, the probability that a randomly selected household from a town contains more than two adults equals the proportion of households of that size in the town.
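The household example can be sketched in a few lines of Python. The population values below are invented purely for illustration, and the helper name `event_probability` is hypothetical. The sketch only shows that the probability of an event is a count of matching population values divided by the population size.

```python
from fractions import Fraction

# Hypothetical finite population: the number of adults in each of
# ten households in a (made-up) town.
adults_per_household = [1, 2, 2, 3, 1, 4, 2, 2, 3, 1]


def event_probability(population, event):
    """Return the proportion of population values for which `event` is true."""
    favourable = sum(1 for value in population if event(value))
    return Fraction(favourable, len(population))


# Event: the selected household contains more than two adults.
p_more_than_two = event_probability(adults_per_household, lambda adults: adults > 2)
print(p_more_than_two)  # 3/10
```

Three of the ten households have more than two adults, so a household selected at random from this population has probability 3/10 of being one of them.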
The same definition can be used for infinite populations (distributions), even if the population is abstract. When one value is selected from the population:
The probability of any value or range of values equals the proportion of these values in the population.
Probability and long-term proportion
There is an alternative but equivalent way to think about probability when it is possible to imagine repeatedly selecting more and more values from the population (e.g. repeating an experiment again and again).
The probability of any value or range of values is the limiting proportion of these values as the sample size increases.
The fact that the sample proportion stabilises at the probability (i.e. the population proportion) as the sample size increases is called the law of large numbers.
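In symbols, the long-term proportion can be written as follows. The notation is an assumption introduced here and is not used elsewhere in the text: $X_i$ is 1 if the event happens for the $i$-th selected value and 0 otherwise, $\hat{p}_n$ is the sample proportion after $n$ independent selections, and $p$ is the probability of the event.

```latex
\hat{p}_n = \frac{1}{n} \sum_{i=1}^{n} X_i ,
\qquad
\hat{p}_n \longrightarrow p \quad \text{as } n \to \infty .
```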
Medical insurance claims
We will consider whether a claim is made on a medical insurance policy in one year. The result is a categorical value ('claim' or 'no claim'). The randomness of whether there is a claim on one policy can be modelled by treating the result as a value that is randomly sampled from an abstract infinite population in which a fixed proportion of the values are 'claim' and the rest are 'no claim'.
The probability that one policy will result in a claim is the proportion of claim values in this underlying population.
Alternatively, we can imagine examining more and more similar insurance policies. The probability of one policy resulting in a claim is also the limiting proportion of claims in this (imaginary) sequence of policies.
These are two different ways to think about the probability, but the value is the same.
Law of large numbers
The diagram below illustrates the fact that a sample proportion tends to a limit as the sample size increases. (The limit is the probability.) Imagine recording whether each of a sequence of medical insurance policies has a claim during one year.
Click Find new value a few times to observe a sequence of policies. When only one policy has been observed, the proportion with claims must be either 0 or 1, but after 20 have been observed, the proportion should be somewhere near 1/3.
Continue observing additional policies until about 1000 have been recorded. By this time, the proportion of claims will have stabilised.
(Hold down the button Find 10 values to speed up the simulation.)
If we carried on indefinitely, the proportion would stabilise at a value that we call the probability of there being a claim on a policy.
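The behaviour of the simulation can be imitated with a short Python sketch. It assumes that policies are independent and that each has a claim probability of 1/3, a value suggested by the text above but not stated as the true rate. The function name `running_proportions` and the fixed random seed are illustrative choices.

```python
import random


def running_proportions(n_policies, claim_probability, seed=0):
    """Simulate policies one at a time and return the proportion
    with claims after each policy has been observed."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    claims = 0
    proportions = []
    for n in range(1, n_policies + 1):
        # Each policy independently has a claim with the assumed probability.
        if rng.random() < claim_probability:
            claims += 1
        proportions.append(claims / n)
    return proportions


# Assumed claim probability of 1/3 (an illustrative value).
props = running_proportions(1000, 1 / 3)
for n in (1, 20, 100, 1000):
    print(n, round(props[n - 1], 3))
```

The first printed proportion is necessarily 0 or 1, and the later ones should vary less and less around the assumed probability. Changing `seed` gives a different sequence of policies but the same long-run behaviour.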