Relationship between variables

In most data sets, we are interested in the relationships between the variables. This is most obvious when the data consist of two numerical variables (where the correlation coefficient and a least squares line describe the relationship) or two categorical variables (where a contingency table and conditional proportions describe it).
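
As a rough illustration of the summaries named above, the following Python sketch computes a correlation coefficient and least squares line for two numerical variables, and a contingency table with conditional proportions for two categorical variables. The function names (`correlation_and_line`, `conditional_proportions`) and all data values are hypothetical, invented only for this example.

```python
from collections import Counter
from math import sqrt


def correlation_and_line(x, y):
    """Return (r, slope, intercept) for paired numerical data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)            # correlation coefficient
    slope = sxy / sxx                    # least squares slope
    intercept = mean_y - slope * mean_x  # least squares intercept
    return r, slope, intercept


def conditional_proportions(row_var, col_var):
    """Return (cell counts, proportions of col_var within each row_var category)."""
    counts = Counter(zip(row_var, col_var))  # contingency table cells
    row_totals = Counter(row_var)
    props = {cell: k / row_totals[cell[0]] for cell, k in counts.items()}
    return counts, props


# Hypothetical numerical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]
r, slope, intercept = correlation_and_line(hours, score)
print(f"r = {r:.3f}, line: score = {intercept:.1f} + {slope:.1f} * hours")

# Hypothetical categorical data, invented for illustration only.
group = ["A", "A", "A", "A", "B", "B"]
answer = ["yes", "yes", "yes", "no", "yes", "no"]
counts, props = conditional_proportions(group, answer)
for (g, a), p in sorted(props.items()):
    print(f"P({a} | {g}) = {p:.2f}")
```

If the conditional proportions differ between the row categories, that difference is what indicates a relationship between the two categorical variables in the sample.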

Groups and relationships

Questions about differences between two or more groups can also be expressed in terms of relationships between variables if group membership is represented by a categorical variable. For example, if we have collected data about incomes from a sample of males and a sample of females, we would be interested in comparing these two groups, that is, in the relationship between income and gender.

Each of the following questions can be restated as a question about the relationship between two variables.

Does a decrease in the retail price of a television increase sales?
    Is there a relationship between price and sales?

Are children from large families more or less likely to go to university?
    Is there a relationship between number of siblings and attendance at university?

Which of three different varieties of corn tastes best?
    Is there a relationship between corn variety and taste?

Do males working in a car yard sell more cars than females?
    Is there a relationship between gender and car sales?
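
One way to see that a comparison of two groups is a relationship between variables is to code group membership as a 0/1 variable. The least squares slope of the numerical variable on that indicator then equals the difference between the two group means. The sketch below checks this on the income and gender example. The income values and the helper names (`mean`, `least_squares_slope`) are hypothetical, made up for illustration.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def least_squares_slope(x, y):
    """Slope of the least squares line for predicting y from x."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical incomes (in thousands), invented for illustration only.
gender = ["male", "male", "male", "female", "female", "female"]
income = [40, 50, 60, 45, 55, 65]

# Group view: split incomes by the categorical variable and compare means.
by_group = {}
for g, value in zip(gender, income):
    by_group.setdefault(g, []).append(value)
difference = mean(by_group["female"]) - mean(by_group["male"])

# Relationship view: code group membership as 0/1 and fit a line.
indicator = [1 if g == "female" else 0 for g in gender]
slope = least_squares_slope(indicator, income)

print(f"difference in group means = {difference:.2f}")
print(f"least squares slope       = {slope:.2f}")
```

The two printed numbers agree, so asking whether the groups differ is the same as asking whether income is related to the group variable.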

Interpreting relationships

Relationships can be much harder to interpret than you might think.

In some situations, an apparent relationship between two variables, such as a pattern evident in a scatterplot, may not reflect a meaningful 'real' relationship.
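
The text does not say here why this can happen. One possible mechanism, assumed purely for illustration, is a third variable that drives both of the others. In the simulated sketch below, `x` and `y` never influence each other, yet they are strongly correlated because both depend on a hypothetical variable `z`. All names, the seed, and the simulated values are assumptions for this example, not data from the text.

```python
import random
from math import sqrt


def correlation(x, y):
    """Correlation coefficient of paired numerical data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical third variable that influences both x and y.
z = [rng.uniform(0, 10) for _ in range(200)]

# x and y each depend on z plus independent noise, but not on each other.
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = correlation(x, y)

# Removing the part of x and y explained by z leaves only the noise terms.
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = correlation(x_rest, y_rest)

print(f"correlation of x and y:         {r_xy:.2f}")
print(f"correlation after removing z:   {r_rest:.2f}")
```

A scatterplot of `x` against `y` would show a clear upward pattern, but that pattern disappears once `z` is taken into account.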