Distribution of proportion

In a random sample from a categorical population with probability π of success, the number of successes, x, has a binomial distribution,

X  ~  binomial (n,  π)

The sample proportion,  p  =  x / n, has a distribution with the same shape but scaled by a factor 1/n. From the properties of the binomial distribution, its distribution has mean and standard deviation

μp  =  π

σp  =  √( π (1 − π) / n )
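As a quick numerical check, the following minimal Python sketch computes the mean and standard deviation of p directly from the binomial probabilities and compares the standard deviation with √( π (1 − π) / n ). The function name proportion_moments and the values n = 20, π = 0.3 are illustrative assumptions, not taken from the text.

```python
from math import comb, sqrt

def proportion_moments(n, pi):
    """Exact mean and standard deviation of p = X/n when X ~ binomial(n, pi)."""
    # Binomial probabilities P(X = x) for x = 0, 1, ..., n
    pmf = [comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)]
    # Mean and variance of the proportion x/n, weighted by those probabilities
    mean = sum((x / n) * prob for x, prob in enumerate(pmf))
    var = sum((x / n - mean) ** 2 * prob for x, prob in enumerate(pmf))
    return mean, sqrt(var)

n, pi = 20, 0.3  # illustrative values (assumed, not from the text)
mean_p, sd_p = proportion_moments(n, pi)
formula_sd = sqrt(pi * (1 - pi) / n)  # standard deviation from the formula

print(f"mean of p = {mean_p:.6f}")
print(f"sd of p   = {sd_p:.6f} (formula gives {formula_sd:.6f})")
```

The two standard deviations should agree up to rounding error, and the mean should equal π.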

Distribution of estimation error

When the proportion p is used to estimate π, the estimation error is p − π. The error distribution has the same shape as that of p but is shifted to have mean zero, so the bias and standard error of the sample proportion are

bias  =  μerror  =  0

standard error  =  σerror  =  √( π (1 − π) / n )

Standard error from data

Unfortunately, the formula for the standard error of p involves π, and this is unknown in practical problems. To get a numerical value for the standard error, we therefore replace π with our best estimate of its value, p.

bias  =  μerror  =  0

standard error  =  σerror  ≈  √( p (1 − p) / n )
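In practice this calculation needs only the number of successes and the sample size. A minimal sketch in Python follows. The helper name standard_error and the example counts (30 successes in 100 trials) are hypothetical, chosen only for illustration.

```python
from math import sqrt

def standard_error(successes, n):
    """Estimated standard error of a sample proportion: sqrt(p(1 - p)/n)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n  # sample proportion, used in place of the unknown pi
    return sqrt(p * (1 - p) / n)

# Illustrative data (assumed, not from the text): 30 successes in 100 trials
se = standard_error(30, 100)
print(f"estimated standard error = {se:.4f}")
```

Because p is itself an estimate, the value returned is only an approximation to the true standard error, though it is usually close when n is large.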

Survival of fruit flies on heat-treated mangoes

The Queensland fruit fly, Bactrocera tryoni, can lay eggs in mangoes, so Australian mangoes must be treated before they can be exported to most international markets.

An experiment was conducted to determine the effectiveness of heat treatment of mangoes to kill fruit fly eggs. The table below shows the published results when mangoes containing 5,903 eggs were heat treated to a core temperature of 43 degrees Celsius.

Surviving adults 637
Eggs killed 5,266
Total eggs 5,903

What is the probability that a fruit fly egg will survive?

There is some underlying probability, π, that an egg will survive the heat treatment and our best estimate is the sample proportion, p = 637/5903 = 0.1079.

How accurate is this estimate?

The number surviving should have a binomial distribution,

X  ~  binomial (n = 5903,  π)

The diagram below shows this distribution with π replaced by our best estimate, p = 0.1079.

The (approximate) distributions of the sample proportion, p, and of the estimation error have the same basic shape as this one; only the scale on the axis changes.

From the error distribution (or from the standard error), it is unlikely that the estimate of survival, p = 0.1079, will be more than 0.01 in error.
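This conclusion can be checked numerically from the counts in the table. The sketch below, in Python, computes the sample proportion and its estimated standard error, then expresses an error of 0.01 as a multiple of that standard error. The variable names are our own.

```python
from math import sqrt

survivors, n = 637, 5903      # counts from the fruit fly table
p = survivors / n             # sample proportion of eggs surviving
se = sqrt(p * (1 - p) / n)    # estimated standard error of p
ratio = 0.01 / se             # an error of 0.01, measured in standard errors

print(f"p = {p:.4f}")
print(f"standard error = {se:.4f}")
print(f"0.01 is {ratio:.1f} standard errors")
```

The standard error is roughly 0.004, so an error of 0.01 corresponds to about two and a half standard errors. Errors that large are uncommon for a distribution of this shape, which is consistent with the statement above.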