Relationship between variables

In most data sets, we are interested in the relationship between the variables. This is most obviously the case when the data consist of two numerical variables (correlation and least squares give information about the relationship) or two categorical variables (a contingency table and conditional proportions help show the relationship).
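The summaries named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (correlation, least_squares, conditional_proportions) and all of the data values are made up for the example and do not come from the text.

```python
from collections import Counter
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least squares line y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def conditional_proportions(pairs):
    """Proportion of each second-variable category within each first-variable category."""
    counts = Counter(pairs)                    # contingency table: {(row, col): count}
    row_totals = Counter(row for row, _ in pairs)
    return {(row, col): k / row_totals[row] for (row, col), k in counts.items()}


# Two numerical variables (hypothetical values).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = correlation(x, y)
slope, intercept = least_squares(x, y)

# Two categorical variables (hypothetical values): (group, answer) for each individual.
pairs = [("A", "yes"), ("A", "yes"), ("A", "no"),
         ("B", "no"), ("B", "no"), ("B", "yes"), ("B", "no")]
props = conditional_proportions(pairs)
```

The correlation and the least squares line summarise the two numerical variables, while the conditional proportions show how the distribution of one categorical variable changes across the categories of the other.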

Groups and relationships

Questions about differences between two or more groups can also be expressed in terms of relationships between variables if group membership is represented by a categorical variable. For example, if we have collected data about incomes from a sample of males and a sample of females, we would be interested in comparing these two groups, that is, in the relationship between income and gender.

Each question about groups below is followed by the same question restated as a relationship between two variables.

Does storage temperature affect the number of bruises in apples?
Is there a relationship between temperature and bruises per apple?

Are the blood pressures of adults with several brothers or sisters different from those of adults who are only children?
Is there a relationship between number of siblings and blood pressure?

Which of three different varieties of corn has the greatest yield?
Is there a relationship between corn variety and yield?

Are male sparrows heavier than female sparrows?
Is there a relationship between gender and weight in sparrows?
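Representing group membership as a categorical variable can be sketched as follows. The sketch assumes a hypothetical data layout in which each record pairs a group label with a measurement. The function name group_means, the variety labels and the yield values are all invented for illustration.

```python
from collections import defaultdict


def group_means(records):
    """Mean of the numerical variable within each category of the grouping variable."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in records:
        totals[group] += value
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical (corn variety, yield) records: one categorical and one numerical variable.
records = [
    ("variety A", 5.0), ("variety A", 7.0),
    ("variety B", 8.0), ("variety B", 10.0),
    ("variety C", 6.0), ("variety C", 6.0),
]

means = group_means(records)
best = max(means, key=means.get)   # variety with the highest mean yield in this sample
```

Comparing the group means is the same task as asking how the numerical variable (yield) relates to the categorical variable (variety). A difference between sample means does not by itself show a difference between the underlying groups.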

Interpreting relationships

Relationships can be much harder to interpret than you might think

In some situations, the relationship between two variables, such as the relationship evident in a scatterplot, may not describe a meaningful 'real' relationship.