Small groups or clusters of 'individuals'
In some data sets, the basic 'individuals' are arranged in fairly small groups or clusters.
In data sets of this kind, a categorical variable could distinguish between the groups, but it would have a large number of possible values, and the values themselves (the plant or household names) are of little direct interest.
Data at different levels
Furthermore, some measurements are usually recorded at group level rather than individual level. For example, the age and height of the plant, or the dwelling type and distance from shops of the household are all recorded at group level. These values could be stored in a separate group-level data matrix.
Each data matrix can be separately analysed.
Household survey
The data below are of a type that is commonly collected in surveys of consumer purchases and attitudes. A sample of houses is selected and information is collected from all individuals in these houses. For illustrative purposes, we show data from only 22 households, with one variable at household level and three at person level.
The data matrix on the left shows household-level information — the house address and its distance from the nearest supermarket (km). The data matrix on the right shows data that were collected from the individuals in the households — age, gender and income.
Note that the households have also been given a unique number and each individual has a 'household ID' variable that links it to its household.
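The link between the two data matrices can be sketched in a few lines of Python. This is only an illustration: the rows, the field names (`household_id`, `distance_km` and so on) and the helper functions are assumptions made for the example, not the survey values or any CAST feature.

```python
# Hypothetical rows for illustration only; these are not the survey values.
households = [
    {"household_id": 1, "address": "12 Example St", "distance_km": 2.0},
    {"household_id": 2, "address": "7 Sample Rd", "distance_km": 0.5},
]
individuals = [
    {"household_id": 1, "age": 34, "gender": "F", "income": 52000},
    {"household_id": 1, "age": 6, "gender": "M", "income": 0},
    {"household_id": 2, "age": 71, "gender": "M", "income": 18000},
]


def members_of(household_id):
    """Individual-level rows belonging to one household."""
    return [p for p in individuals if p["household_id"] == household_id]


def household_of(person):
    """Household-level row that an individual's household ID points to."""
    matches = [h for h in households
               if h["household_id"] == person["household_id"]]
    if len(matches) != 1:
        raise ValueError("household ID must match exactly one household")
    return matches[0]


print(members_of(1))                 # the two people in household 1
print(household_of(individuals[2]))  # the household of the third person
```

The household ID is the only thing the two matrices share, so it must be unique at household level; the check in `household_of` makes that requirement explicit.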
Moving information between the data matrices
Information can be exchanged between the two data matrices in order to analyse both sets of data together.
Information can be obtained from multi-level data by examining both the group-level and individual-level data matrices.
Household survey
In the household survey above, the 'Distance to shops' data from household level can be copied to the individual-level data matrix to be analysed at individual level.
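Copying a household-level variable down to the individuals amounts to a lookup on the household ID. The sketch below assumes hypothetical rows and field names (`household_id`, `distance_km`); the function `add_household_variable` is likewise invented for the illustration.

```python
# Hypothetical values for illustration only; not the survey data.
households = {
    1: {"distance_km": 2.0},
    2: {"distance_km": 0.5},
}
individuals = [
    {"household_id": 1, "age": 34, "income": 52000},
    {"household_id": 1, "age": 6, "income": 0},
    {"household_id": 2, "age": 71, "income": 18000},
]


def add_household_variable(individuals, households, name):
    """Return new individual-level rows with one household variable copied in."""
    return [
        {**person, name: households[person["household_id"]][name]}
        for person in individuals
    ]


merged = add_household_variable(individuals, households, "distance_km")
for row in merged:
    print(row)
```

Every individual in a household receives the same copied value, so a household with many members contributes that value many times to any individual-level summary.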
In a similar way, information from the individuals in a household can be used to create new variables at household level. Examples are Hh size (household size), Min age (minimum age) and Total income. Their values for any household are determined from variables in the individual-level data matrix for the individuals in that household.
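Moving information in the other direction means summarising the individuals within each household. The following sketch derives household size, minimum age and total income from hypothetical individual-level rows; the field names and the function `summarise_households` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical values for illustration only; not the survey data.
individuals = [
    {"household_id": 1, "age": 34, "income": 52000},
    {"household_id": 1, "age": 6, "income": 0},
    {"household_id": 2, "age": 71, "income": 18000},
]


def summarise_households(individuals):
    """Build household-level variables from the individual-level rows."""
    groups = defaultdict(list)
    for person in individuals:
        groups[person["household_id"]].append(person)
    return {
        household_id: {
            "hh_size": len(members),
            "min_age": min(m["age"] for m in members),
            "total_income": sum(m["income"] for m in members),
        }
        for household_id, members in groups.items()
    }


summary = summarise_households(individuals)
for household_id, values in summary.items():
    print(household_id, values)
```

Each derived variable is a single number per household, so the result has one row per household and can be placed alongside the other household-level variables.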
Analysis of the two data matrices
The data-analysis methods that will be described in CAST can be used with both the group-level and individual-level data matrices. It is, however, important to recognise the difference between analysing the data at these two levels.
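Consider a household with 1 individual that is 2 km from shops and another household with 9 individuals that is 1 km from shops. At household level the two households count equally, but at individual level the second household supplies nine of the ten rows, so a summary of distance to shops can differ markedly between the two levels.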
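The arithmetic for these two households can be checked with a short sketch. The household sizes and distances are those given above; the variable names are chosen only for this illustration.

```python
# The two households from the example: size and distance to shops (km).
households = [
    {"size": 1, "distance_km": 2.0},
    {"size": 9, "distance_km": 1.0},
]

# Household level: each household counts once.
household_mean = sum(h["distance_km"] for h in households) / len(households)

# Individual level: each household's distance is repeated once per member.
n_individuals = sum(h["size"] for h in households)
individual_mean = (
    sum(h["size"] * h["distance_km"] for h in households) / n_individuals
)

print(f"Mean distance per household:  {household_mean:.1f} km")
print(f"Mean distance per individual: {individual_mean:.1f} km")
```

The household-level mean is 1.5 km, whereas the individual-level mean is 1.1 km because nine of the ten individuals live 1 km from shops. Neither figure is wrong; they answer different questions, one about the typical household and one about the typical person.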
Properly analysing multi-level data and interpreting the results of the analysis requires a lot of careful thought!
Although you can analyse both the group-level and individual-level data matrices using the methods that will be described in CAST, advanced statistical methods are needed to fully analyse multi-level data.