Bivariate data: population or sample?

Some bivariate data sets are complete populations — there is no larger underlying population of which the data are representative. The 'individuals' in such data sets commonly have names or other labels that are an inherent part of the data.

More often, we have no interest in the specific individuals from which the data are collected. The individuals are 'representative' of a larger population or process, and our main interest is in this underlying population.

Salaries of human resources managers

The scatterplot below shows, for each of the mainland states of the USA (Washington DC is excluded), the average salary of human resources managers and the state's population density.

There is a tendency for states with high population densities to have relatively high salaries. However, our main interest is in the names of the states with high or low values. Click on the crosses to identify the states.

Bank branches and minorities

To investigate whether banks serve all communities equally, a New Jersey newspaper compiled data from each of New Jersey's 21 counties. The scatterplot below shows the number of people per bank branch in each county and the percentage of the county's population that belongs to minority groups.

Local residents might be interested in the specific counties, but most outsiders would want to generalise from the data to a relationship that might also hold in other similar areas in the Eastern USA. How strong is the evidence that banks tend to have fewer branches in areas with large minority groups?
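One informal way to approach a question like this is a permutation test on the correlation coefficient. The sketch below is only an illustration: the function names are made up for this example, and the eight data values are hypothetical, not the New Jersey county data. It shuffles one variable many times to see how often chance alone would produce a correlation at least as large as the one observed.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """One-sided permutation test for a positive association.

    Returns the observed correlation and the proportion of shuffles
    whose correlation is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if pearson_r(xs, shuffled) >= observed:
            extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_shuffles + 1)


# Hypothetical values for eight imaginary areas (NOT the real data).
minority_pct = [5, 8, 12, 15, 20, 26, 31, 40]
people_per_branch = [2100, 2500, 2300, 2900, 2700, 3400, 3300, 4100]

r, p = permutation_p_value(minority_pct, people_per_branch)
print(f"correlation = {r:.2f}, one-sided p-value = {p:.4f}")
```

A small p-value would suggest that the association is unlikely to be a chance pattern, which is the kind of evidence needed before generalising beyond the individuals in the data. If the data were instead treated as a complete population, no such test would be needed, since the observed relationship would simply be a description of that population.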