What is the purpose of statistics?
Data?
When non-statisticians think of statistics, the first thing that usually comes to mind is data. Social acquaintances are likely to expect a statistician to be an expert on unemployment rates, foreign exchange rates, crime rates, and other economic and social indicators. Although a statistician may have had input into how these data were collected, the average statistician is no more likely to know their values than anyone else in the population.
Large amounts of data are routinely collected and published. Most libraries have reference sections with shelves of information published by government departments and industrial groups — we often wonder why so much effort was expended collecting and publishing so much data!
After completing an introductory statistics course, most students realise that statistical analysis is also applied to smaller data sets that are collected in business, biology, medicine, and most other disciplines. Modern textbooks are full of such data.
Statistical analysis uses data, but the data are not the goal.
Statistical methods?
Data are the basic commodity of statistics. Without data, there is no information on which to reach conclusions or base decisions. However, the information in data is often not immediately obvious, especially in large data sets.
Data contain information.
The purpose of statistics is to extract information from data.
Large data sets must be summarised before patterns and relationships can be seen — there is usually too much noise in the raw data to see the information that they contain. Statistical analyses use graphical and numerical methods to highlight important features of the data.
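As a minimal sketch of numerical summarising, the Python example below reduces 10,000 noisy values to a count, mean, and standard deviation per group. The data are simulated purely for illustration, and the function name `summarise_by_group` and the group structure are assumptions made for this example, not part of any particular data set.

```python
import random
import statistics

def summarise_by_group(records):
    """Return {group: (count, mean, standard deviation)} for (group, value) pairs."""
    grouped = {}
    for group, value in records:
        grouped.setdefault(group, []).append(value)
    return {
        group: (len(values), statistics.mean(values), statistics.stdev(values))
        for group, values in sorted(grouped.items())
    }

# Hypothetical data: 5 groups whose true means rise by 1 unit per group,
# buried in noise with a standard deviation of 10.
rng = random.Random(1)
records = [(g, g + rng.gauss(0, 10)) for g in range(5) for _ in range(2000)]

summary = summarise_by_group(records)
for group, (n, mean, sd) in summary.items():
    print(f"group {group}: n={n}, mean={mean:.2f}, sd={sd:.2f}")
```

The upward trend in the group means is very hard to spot by scanning the individual values, because the noise is much larger than the difference between neighbouring groups. It should be easier to see in the five summary lines.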
In smaller data sets, it is less important to summarise the data. The problem is usually that there is not enough information to get a clear answer to questions of importance. Statistical methods are needed to describe how precise the answers are and to ensure that the data are collected and analysed in a way that gives the highest possible precision.
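One common way to describe precision in a small sample is the standard error of the mean, which is the sample standard deviation divided by the square root of the sample size. The Python sketch below illustrates the idea. The measurements are made up for this example, the function name `mean_with_standard_error` is hypothetical, and the "plus or minus 2 standard errors" interval is only a rough rule of thumb.

```python
import math
import statistics

def mean_with_standard_error(values):
    """Return (mean, standard error of the mean) for a sample of 2 or more values."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se

# Hypothetical measurements, made up for illustration only.
sample = [4.1, 5.3, 3.8, 4.9, 5.6, 4.4, 5.0, 4.7]

mean, se = mean_with_standard_error(sample)
# Rough interval: mean plus or minus 2 standard errors (an approximation;
# a t-based multiplier would be somewhat larger for a sample this small).
print(f"mean = {mean:.2f}, standard error = {se:.2f}")
print(f"rough interval: {mean - 2 * se:.2f} to {mean + 2 * se:.2f}")
```

Because the standard error shrinks in proportion to the square root of the sample size, halving it requires roughly four times as many observations, provided the spread of the data stays about the same.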
Context is most important
Many introductory statistics courses focus on the toolkit of statistical methods that is used to extract information from large and small data sets. Although it is important to thoroughly understand these statistical methods, users of statistics are generally interested in questions in their own subject area.
Statistical methods are used to extract information from data.
However, the methods themselves are not the goal.
The aim of statistics is to supply useful information to people whose main area of expertise is not statistics. These people are not interested directly in either data or statistical methods. Data are collected in order to throw light on some question of importance to the engineer, biologist, climatologist, administrator, or other professional. Statistical methods are only useful if they can extract information from data to help answer discipline-specific questions. The underlying context is therefore the most important aspect of any statistical analysis.
Nobody collects data unless they are interested in the context and in what the data might tell them. Many students whose main interest is outside statistics therefore only appreciate the subject long after they have completed their 'statistical methods' course — when their job requires answers that must be based on data.
If you are not primarily a statistician, you will appreciate statistical methods when they are needed in your career!
Context in CAST
It is easy to miss the importance of statistics if you have no personal interest in the context of the data sets that you use. Even when a statistics course is targeted at learners from a specific area such as commerce or engineering, there is rarely a personal interest in the context of the specific data sets used in the course. The best we can do is to expose you to a wide range of data and contexts and assure you that the same statistical methods will be equally useful for analysing data of real interest to you later.