Small groups or clusters of 'individuals'

In some data sets, the basic 'individuals' are arranged in fairly small groups or clusters.

Measurements of plants
In a study investigating plant growth, the researcher might record the nutrient content of a few leaves from each of a sample of plants. The leaves are the natural 'individuals' in the data matrix since measurements are made from each individual leaf, but they are grouped by plant.
Household income
Surveys often select households in an area and then obtain information about every member of each household. (Sampling clusters of people is cheaper than sampling individuals.) The people are the natural 'individuals' in the data matrix, but they are grouped by household.

In data sets of these kinds, a categorical variable could distinguish between the groups, but there would be a large number of possible values and these values (the plant or household names) are of little direct interest.

Data at different levels

Furthermore, some measurements are usually recorded at group level rather than individual level. For example, the age and height of a plant, or the dwelling type and distance from shops of a household, are recorded at group level. These values could be stored in a separate group-level data matrix.

 

Each data matrix can be separately analysed.


Household survey in Malawi

The data below are based on an activity diary study that was carried out in Malawi. Individuals within households kept a record of activities carried out at 4 different times of the day. For illustrative purposes, we have only shown data from 18 households and have not shown information from the activity records.

The data matrix on the left shows household-level information — the household name and its distance from drinking water (km). The data matrix on the right shows data that were collected from the individuals in the households — age, gender and relationship within the household (head, spouse, child, mother/father, etc.).

Note that the households have also been given a unique number and each individual has a 'household ID' variable that links it to its household. Click on any household or individual to see the corresponding entries in the two tables.
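As a rough sketch of how the two linked data matrices might be held in code, the Python fragment below stores each matrix as a list of rows and uses a household ID to move between them. All values, the field names (hh_id, distance_to_water_km, and so on) and the helper functions members_of and household_of are hypothetical and chosen only for illustration; they are not taken from the Malawi survey.

```python
# Hypothetical values for illustration only; not from the Malawi survey.
households = [
    {"hh_id": 1, "name": "Household A", "distance_to_water_km": 0.5},
    {"hh_id": 2, "name": "Household B", "distance_to_water_km": 2.0},
]

individuals = [
    {"hh_id": 1, "age": 41, "gender": "F", "relationship": "head"},
    {"hh_id": 1, "age": 12, "gender": "M", "relationship": "child"},
    {"hh_id": 2, "age": 35, "gender": "M", "relationship": "head"},
    {"hh_id": 2, "age": 33, "gender": "F", "relationship": "spouse"},
    {"hh_id": 2, "age": 6, "gender": "F", "relationship": "child"},
]


def members_of(hh_id):
    """Return the individual-level rows that belong to one household."""
    return [person for person in individuals if person["hh_id"] == hh_id]


def household_of(person):
    """Return the household-level row that an individual is linked to."""
    return next(h for h in households if h["hh_id"] == person["hh_id"])
```

Calling members_of(2) returns the three rows for the second household, and household_of(individuals[1]) returns the row for the first household, which mirrors clicking a row in one table to find the matching rows in the other.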

Moving information between the data matrices

Information can be exchanged between the two data matrices in order to analyse both sets of data together.

Group level → individual level
Each row of the individual-level data matrix could be augmented with values copied from the corresponding row (group) of the group-level data matrix.
Individual level → group level
It is also possible to summarise individual-level data and add the summaries to the group-level data matrix. For example, we could add a 'maximum age' or 'household size' variable to the household-level data matrix. In the plant-and-leaf scenario described at the top of this page, an 'average leaf length' variable could be added to the plant-level data matrix.
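Both directions of exchange can be sketched in a few lines of Python using the plant-and-leaf scenario. The data values and the names plant_id, height_cm, leaf_length_cm, plant_height_cm and ave_leaf_length_cm are assumptions made up for this illustration.

```python
from statistics import mean

# Hypothetical plant-level (group) and leaf-level (individual) data.
plants = [
    {"plant_id": "P1", "height_cm": 30.0},
    {"plant_id": "P2", "height_cm": 45.0},
]
leaves = [
    {"plant_id": "P1", "leaf_length_cm": 4.0},
    {"plant_id": "P1", "leaf_length_cm": 6.0},
    {"plant_id": "P2", "leaf_length_cm": 7.0},
    {"plant_id": "P2", "leaf_length_cm": 8.0},
    {"plant_id": "P2", "leaf_length_cm": 9.0},
]

# Group level -> individual level:
# copy each plant's height onto every leaf row from that plant.
height_by_plant = {p["plant_id"]: p["height_cm"] for p in plants}
leaves_augmented = [
    {**leaf, "plant_height_cm": height_by_plant[leaf["plant_id"]]}
    for leaf in leaves
]

# Individual level -> group level:
# summarise the leaves of each plant and add the summary to the plant row.
plants_augmented = []
for plant in plants:
    lengths = [
        leaf["leaf_length_cm"]
        for leaf in leaves
        if leaf["plant_id"] == plant["plant_id"]
    ]
    plants_augmented.append({**plant, "ave_leaf_length_cm": mean(lengths)})
```

Copying downwards repeats the same group value on several individual rows, whereas summarising upwards reduces several individual rows to a single value per group.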

Information can be obtained from multi-level data by examining both the group-level and individual-level data matrices.


Household survey in Malawi

In the activity-record survey in Malawi, the 'Distance to water' data from household level can be copied to the individual-level data matrix to be analysed at individual level.

In a similar way, information from the individuals in a household can be used to create new variables at household level. Select Hh size (household size), Min age (minimum age) and Ave age (average age) from the pop-up menu. Their values for any household are determined from variables in the individual-level data matrix for the individuals in the household. (Again, click any household's row to highlight the individuals in it.)
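The three household-level summaries could be computed along the following lines. This is only a sketch: the ages are invented, and the keys hh_size, min_age and ave_age are hypothetical stand-ins for the Hh size, Min age and Ave age variables.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level rows; ages are invented for illustration.
individuals = [
    {"hh_id": 1, "age": 40},
    {"hh_id": 1, "age": 10},
    {"hh_id": 2, "age": 30},
    {"hh_id": 2, "age": 28},
    {"hh_id": 2, "age": 5},
]

# Collect the ages of the individuals in each household.
ages_by_household = defaultdict(list)
for person in individuals:
    ages_by_household[person["hh_id"]].append(person["age"])

# One summary row per household, ready to add to the household-level matrix.
household_summary = {
    hh_id: {
        "hh_size": len(ages),
        "min_age": min(ages),
        "ave_age": mean(ages),
    }
    for hh_id, ages in ages_by_household.items()
}
```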

Analysis of the two data matrices

The data-analysis methods that will be described in CAST can be used with both the group-level and individual-level data matrices. It is, however, important to recognise the difference between analysing the data at these two levels.

Consider one household with 1 individual, located 2 km from water, and another household with 9 individuals, located 1 km from water.

Household level
The average household distance from water is 1.5 km — the average of the two household distances.
Individual level
The average individual distance from water is 1.1 km — the average of the ten individual distances 1, 1, 1, 1, 1, 1, 1, 1, 1, 2.
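The two averages above can be checked with a short Python sketch. The field names size and distance_km are assumptions introduced for this illustration; the numbers are those of the two households just described.

```python
from statistics import mean

# The two households from the example: (number of individuals, distance in km).
households = [
    {"size": 1, "distance_km": 2.0},
    {"size": 9, "distance_km": 1.0},
]

# Household level: each household counts once.
household_mean = mean(h["distance_km"] for h in households)

# Individual level: each household's distance is repeated once per member,
# so larger households carry more weight.
individual_distances = [
    h["distance_km"] for h in households for _ in range(h["size"])
]
individual_mean = mean(individual_distances)

print(household_mean, individual_mean)
```

The individual-level average is a weighted average of the household distances, with household sizes as the weights, which is why the large household pulls it towards 1 km.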

Properly analysing multi-level data and interpreting the results of the analysis requires a lot of careful thought!

Although you can analyse both the group-level and individual-level data matrices using the methods that will be described in CAST, advanced statistical methods are needed to fully analyse multi-level data.