Bivariate data

Bootstrap sampling can be used to obtain an approximate error distribution in any situation where individuals are randomly sampled from a population. The scatterplot below shows a bivariate data set.

How accurately does the sample correlation coefficient, r = 0.787, estimate the population correlation underlying the data?

Bootstrap

We can again find an approximate error distribution using bootstrap samples selected with replacement from the data. The scatterplot below shows one such bootstrap sample. The digits again mark data values that were sampled more than once.
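As a rough illustration of how one bootstrap sample of bivariate data might be drawn, the sketch below resamples whole (x, y) pairs with replacement, so each x stays with its own y. The data values and the helper name `bootstrap_sample` are hypothetical and are not taken from the scatterplot.

```python
import random
from collections import Counter

def bootstrap_sample(pairs, rng):
    """Draw len(pairs) (x, y) pairs with replacement, keeping each x with its y."""
    n = len(pairs)
    return [pairs[rng.randrange(n)] for _ in range(n)]

# Hypothetical data for illustration only (not the data set in the scatterplot).
pairs = [(1.0, 2.1), (2.0, 2.9), (3.0, 3.2), (4.0, 4.8), (5.0, 5.1), (6.0, 5.7)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = bootstrap_sample(pairs, rng)

# Count how often each original pair was drawn; counts above 1 correspond
# to the digits shown on the bootstrap scatterplot.
counts = Counter(sample)
repeated = {pair: c for pair, c in counts.items() if c > 1}
print(repeated)
```

The bootstrap sample has the same size as the original data, so some pairs typically appear more than once and others not at all.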

For each of several bootstrap samples, we can find how far its correlation coefficient is from the "population" value, 0.787. This difference is the estimation error for that sample.

The bootstrap error distribution provides us with an approximate standard error for the correlation coefficient. The correlation coefficient from our data set, 0.787, will probably be within 2 standard errors (approx 0.065) of the underlying population value.
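The whole procedure can be sketched as follows, under stated assumptions: the data are synthetic (generated here, not the data set in the scatterplot), the function names `pearson_r` and `bootstrap_errors` are hypothetical, and the standard error is taken to be the standard deviation of the bootstrap errors.

```python
import math
import random
import statistics

def pearson_r(pairs):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_errors(pairs, n_boot, rng):
    """Errors (bootstrap r minus data r), treating the data as the 'population'."""
    r_data = pearson_r(pairs)
    n = len(pairs)
    errors = []
    while len(errors) < n_boot:
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        # Skip the (very rare) resample in which x or y is constant,
        # because r is undefined there.
        if len({x for x, _ in sample}) < 2 or len({y for _, y in sample}) < 2:
            continue
        errors.append(pearson_r(sample) - r_data)
    return errors

# Synthetic bivariate data for illustration only.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(30)]
pairs = [(x, x + rng.gauss(0, 0.8)) for x in xs]

errors = bootstrap_errors(pairs, n_boot=1000, rng=rng)
std_error = statistics.stdev(errors)  # approximate standard error of r
r_data = pearson_r(pairs)
print(f"r = {r_data:.3f}, approx. standard error = {std_error:.3f}")
print(f"2 standard errors = {2 * std_error:.3f}")
```

With more bootstrap samples the error distribution is estimated more smoothly, but the approximation is still limited by how well the original sample represents the population.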