How much data do I need to collect?
This is one of the most common questions put to statistical consultants. Data collection is expensive, so there is a clear incentive to keep sample sizes as small as possible. However, insufficient data means that the parameters of interest cannot be estimated accurately enough.
How many plots of corn should be planted?
We now give an example that finds the sample size needed to estimate a population mean with specified accuracy.
Obtaining the sample size by solving an equation
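Provided the sample size will be reasonably large, it is possible to replace the t-value in the above inequality by 1.96. For our estimate to be within k of µ with probability 0.95, we therefore need

\[ 1.96 \times \frac{s}{\sqrt{n}} \le k \]

where s is the guessed value of the sample standard deviation. This inequality can be re-written in the form

\[ n \ge \left( \frac{1.96\, s}{k} \right)^{2} \]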
In practice, it is best to increase n a little above this value in case the guessed sample standard deviation turns out to be too low.
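The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the requirement to be that 1.96 times s divided by the square root of n is at most k, which rearranges to n being at least (1.96 s / k) squared. The function name `required_sample_size` and the numbers in the example (a guessed standard deviation of 10 and a target accuracy of 2) are hypothetical and are not taken from the corn example.

```python
import math


def required_sample_size(s_guess, k, z=1.96):
    """Smallest integer n satisfying z * s_guess / sqrt(n) <= k.

    s_guess: guessed sample standard deviation (an assumption made by the user)
    k:       required accuracy, i.e. the estimate should be within k of the mean
    z:       normal approximation to the t-value (1.96 for probability 0.95)
    """
    if s_guess <= 0 or k <= 0:
        raise ValueError("s_guess and k must be positive")
    # Rearranged inequality: n >= (z * s_guess / k) ** 2, rounded up to an integer
    return math.ceil((z * s_guess / k) ** 2)


# Hypothetical inputs: guessed standard deviation 10, target accuracy 2
n = required_sample_size(10.0, 2.0)
print(n)  # 97, since (1.96 * 10 / 2) ** 2 = 96.04
```

Because the result depends on the square of the guessed standard deviation, a guess that is 10% too low understates the required n by roughly 20%, which is one reason for rounding the answer up generously.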