Creating new variables

When given a data set to analyse, you should always ask yourself whether the variables are the most useful ones to analyse. Sometimes a simple transformation provides a variable whose values are more meaningful or highlight a different aspect of the data. This is most easily illustrated with some examples.

European population

Our first example shows the populations (millions) of all European countries in 2012 and their land areas (thousand km²).

Select Population and Area from the pop-up menu to display these variables on the European map with different colours. These colours represent different aspects of the 'sizes' of the countries and add little to the map. It is more interesting to examine population density, defined by:

$$\text{Density} = \frac{\text{Population}}{\text{Area}} \times 1000$$

Select Density from the pop-up menu to display its distribution.
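As a rough sketch of the same calculation in code, the density can be derived from the two original variables. The country names and numbers below are hypothetical, chosen only to illustrate the arithmetic, and the function name `density` is our own. Because population is in millions and area is in thousand km², multiplying the ratio by 1000 gives people per km².

```python
# Hypothetical values for illustration only, not real country data.
# Population is in millions; area is in thousand km².
countries = {
    "Country A": {"population": 10.0, "area": 50.0},
    "Country B": {"population": 60.0, "area": 300.0},
    "Country C": {"population": 5.0, "area": 400.0},
}


def density(population_millions, area_thousand_km2):
    """Return population density in people per km²."""
    return population_millions / area_thousand_km2 * 1000


# Derive the new variable for every country.
densities = {
    name: density(values["population"], values["area"])
    for name, values in countries.items()
}

for name, value in densities.items():
    print(f"{name}: {value:.1f} people per km²")
```

In this made-up data, Country A and Country B differ greatly in both population and area, yet they have the same density. That information is not visible from either original variable on its own.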

African calorie intake

An African example shows the calorie intake (calories per capita per day) of African countries in 2004 and 2009.

Select 2004 cals and 2009 cals from the pop-up menu. The maps are effective at highlighting the parts of Africa with the worst nutrition. However, these two maps do not clearly show the changes in nutrition between 2004 and 2009. The percentage change in calorie intake can be defined by:

$$\%\text{Increase} = 100 \times \frac{\text{2009 calories} - \text{2004 calories}}{\text{2004 calories}}$$

Select %Increase from the pop-up menu to see which regions of Africa had increases or decreases in their calorie intake.
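The percentage change can be sketched in code in the same way. The calorie figures below are hypothetical and the function name `percent_increase` is our own. A negative result indicates a decrease in calorie intake.

```python
def percent_increase(cals_2004, cals_2009):
    """Return the percentage change from the 2004 value to the 2009 value."""
    return 100 * (cals_2009 - cals_2004) / cals_2004


# Hypothetical calorie intakes (calories per capita per day), for illustration only.
intake = {
    "Country A": {"2004": 2000.0, "2009": 2200.0},
    "Country B": {"2004": 2500.0, "2009": 2400.0},
}

# Derive the new variable for every country.
changes = {
    name: percent_increase(values["2004"], values["2009"])
    for name, values in intake.items()
}

for name, value in changes.items():
    print(f"{name}: {value:+.1f}%")
```

In this made-up data, Country B has the higher intake in both years, but only the derived variable shows that its intake fell while Country A's rose.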

Other examples

The Gross Domestic Product (GDP) of countries is often reported. It is usually more meaningful to divide by the population size to obtain the GDP per capita.

When the values of imports or exports are reported over a series of years, it is usual to take account of inflation by dividing by a price index (e.g. the consumer price index).
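A minimal sketch of this adjustment is given below. It assumes a price index scaled so that the base year equals 100; the export values and index values are hypothetical, and the function name `deflate` is our own.

```python
def deflate(nominal_value, price_index, base=100.0):
    """Convert a nominal value to base-year prices using a price index."""
    return nominal_value * base / price_index


# Hypothetical nominal export values and price index (base year 2010 = 100).
exports_nominal = {2010: 500.0, 2011: 550.0, 2012: 600.0}
price_index = {2010: 100.0, 2011: 110.0, 2012: 125.0}

# Derive the inflation-adjusted (real) value for every year.
exports_real = {
    year: deflate(exports_nominal[year], price_index[year])
    for year in exports_nominal
}

for year, value in exports_real.items():
    print(f"{year}: nominal {exports_nominal[year]:.0f}, real {value:.0f}")
```

In this made-up series the nominal values rise every year, but after adjustment the real value is unchanged in 2011 and lower in 2012.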

Always think carefully about the variables you have been given. Are they accurate? Are there other variables that would be more informative?