Yield of rice from 36 farmers in 4 Sri Lankan villages

Use the diagram to see how a data set whose values fall into different groups can be represented in a data matrix, with a categorical variable indicating group membership. Click on values on the left to see their representation in the data matrix.
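The same conversion from grouped values to a data matrix can be sketched in code. The following is a minimal illustration only: the village names and yields are made up for the example (they are not the survey values), and `groups_to_matrix` is a hypothetical helper name.

```python
# Hypothetical yields (tonnes per hectare), keyed by village.
# These numbers are invented for illustration, not taken from the survey.
groups = {
    "Village A": [4.2, 5.1],
    "Village B": [3.8],
    "Village C": [6.0, 5.5, 4.9],
}


def groups_to_matrix(groups):
    """Return one (value, group label) row for every value in every group."""
    return [
        (value, label)
        for label, values in groups.items()
        for value in values
    ]


rows = groups_to_matrix(groups)

# Each row of the data matrix holds a numerical value and its categorical label.
for value, label in rows:
    print(f"{value:5.1f}  {label}")
```

Each value keeps its group as a label in the second column, so no information is lost when the separate groups are stacked into a single matrix.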

Select Categorical variable -> Groups from the pop-up menu to change the diagram to one that shows the inverse: a categorical variable can be regarded as splitting a set of values into groups. Again, click on values to see the correspondence between the two representations.
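The inverse direction, in which a categorical variable splits a column of values into groups, might be sketched as follows. Again, the rows are invented for illustration, and `matrix_to_groups` is a hypothetical helper name.

```python
# Hypothetical data matrix: (yield in tonnes per hectare, village) per row.
# These numbers are invented for illustration, not taken from the survey.
rows = [
    (4.2, "Village A"),
    (3.8, "Village B"),
    (5.1, "Village A"),
    (6.0, "Village C"),
    (5.5, "Village C"),
]


def matrix_to_groups(rows):
    """Split the values into lists, one list per level of the categorical variable."""
    groups = {}
    for value, label in rows:
        groups.setdefault(label, []).append(value)
    return groups


groups = matrix_to_groups(rows)

for label, values in groups.items():
    print(label, values)
```

The rows need not be sorted by group; the categorical label alone determines which group each value joins.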

The first data set arose from a survey of rice producers in Sri Lanka in which 36 farmers were randomly selected from 4 villages. The yield of rice (tonnes per hectare) was determined for each farmer.

The second data set is artificial. It describes a factory manager who is concerned about the number of days that workers take off each year due to illness. Personnel records for 60 workers were examined, and each worker's sick days were recorded along with their age group: old (40 or over) or young (under 40).