Time series data
Data sets whose measurements are made sequentially at regular intervals are called time series. We often want to forecast future values of a time series.
The importance of plotting
As with other data structures, the information in a time series is most easily understood from a graphical display. A time series plot is a type of scatterplot whose horizontal axis shows the time-ordering of the values. Successive crosses are often joined by lines.
Types of pattern
| Pattern | Description |
|---|---|
| Trend | Long-term increases and decreases in the values. |
| Seasonal variation | A pattern that is repeated each year. This is often evident in monthly or quarterly data. |
| Cyclic variation (autocorrelation) | Arises when values are similar to adjacent values, making irregular waves or cycles. |
| Random fluctuations | 'Ups and downs' in a time series that do not correspond to trend, seasonal variation or autocorrelation. |
Most time series show more than one of these patterns to some degree.
Displaying several time series on the same plot
Several related time series can be superimposed with different colours on the same display, making comparisons easier. The crosses may be omitted.
Smoothing
Random fluctuations in a time series are usually noise that can obscure trend and other signal in the series. The values can be smoothed to reduce these random fluctuations and show the systematic movement in the series more clearly.
smoothed value = centre ( original value and adjacent values )
Moving averages
A 3-point moving average replaces the value at time i, $y_i$, by the mean of that value and its two neighbours,
$\dfrac{y_{i-1} + y_i + y_{i+1}}{3}$
Moving averages are also called running means. For example, a 7-point moving average replaces each value with the mean of it and the 3 adjacent values on each side.
The more adjacent values used, the greater the smoothing.
Ends of the series
Note that moving averages cannot be used to smooth the values at the two ends of the time series.
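The calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `moving_average`, the parameter `run_length` and the data values are all assumptions made up for the example. Positions at the two ends, where there are not enough neighbours, are left as `None`.

```python
def moving_average(values, run_length=3):
    """Odd-length moving average (running mean).

    Each value is replaced by the mean of itself and the
    (run_length - 1) / 2 adjacent values on each side. The ends of the
    series cannot be smoothed and are returned as None.
    """
    if run_length < 1 or run_length % 2 == 0:
        raise ValueError("this sketch handles odd run lengths only")
    half = run_length // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        smoothed[i] = sum(window) / run_length
    return smoothed


# Made-up illustrative series, not data from the text.
series = [4, 6, 5, 9, 7, 8, 12]
three_point = moving_average(series, 3)   # one None at each end
seven_point = moving_average(series, 7)   # three None values at each end
print(three_point)
print(seven_point)
```

With a longer run length, more values are lost at each end, which is the price of the extra smoothing.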
Moving average with odd and even run lengths
A moving average provides a smoothed value at the middle of the times of the values being averaged.
If averaging over an even number of values, the smoothed value is for a time between those of the data values, such as "year 2005.5".
A second stage of averaging for even run lengths
To provide smoothed values at the same times as the raw data, we often take a further 2-point moving average.
This is equivalent to giving half weight to the two outermost values. If based on moving averages of 4, this is called a 4-point centred moving average.
These centred averages are particularly useful when analysing seasonal data. For example, 12-point centred moving averages are often used for monthly data.
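The two stages can be collapsed into a single weighted average, as the following Python sketch shows. The function name `centred_moving_average` and the data are assumptions for illustration only. The window holds `run_length + 1` values, and the two outermost values receive half weight.

```python
def centred_moving_average(values, run_length=4):
    """Centred moving average for an even run length.

    Equivalent to a run_length-point moving average followed by a
    2-point moving average: the two outermost values in each window
    get half weight and the total is divided by run_length.
    """
    if run_length < 2 or run_length % 2 != 0:
        raise ValueError("this sketch handles even run lengths only")
    half = run_length // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]  # run_length + 1 values
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        smoothed[i] = total / run_length
    return smoothed


# Made-up illustrative series, not data from the text.
series = [2, 4, 6, 8, 10, 12]
print(centred_moving_average(series, 4))
```

For monthly data, calling the same function with `run_length=12` averages over a full year while keeping each smoothed value aligned with an actual month.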
Outliers and running medians
Since medians are less sensitive to outliers than means, a more robust alternative to running means replaces each value by the median of it and adjacent values. A 3-point running median replaces the value at time i, $y_i$, by
$\operatorname{median}(y_{i-1},\ y_i,\ y_{i+1})$
and higher-order running medians will use more adjacent values.
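The effect of an outlier on the two kinds of smoothing can be seen in a small Python sketch. The function name `running_median` and the series, which contains one deliberately extreme value, are made up for illustration.

```python
from statistics import median


def running_median(values, run_length=3):
    """Odd-length running median; the ends are returned as None."""
    if run_length < 1 or run_length % 2 == 0:
        raise ValueError("this sketch handles odd run lengths only")
    half = run_length // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        smoothed[i] = median(values[i - half : i + half + 1])
    return smoothed


# Made-up series with a single outlier (40) at position 2.
series = [5, 6, 40, 7, 8, 9]
medians = running_median(series, 3)

# 3-point mean at the outlier's position, for comparison.
mean_at_outlier = sum(series[1:4]) / 3

print(medians)
print(mean_at_outlier)
```

The running median ignores the outlier entirely, whereas the 3-point mean at the same position is pulled far above the neighbouring values.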
Comparison of means and medians
Running medians, followed by moving averages
To take advantage of the best features of both moving averages and running medians, these two techniques are often applied sequentially.
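One way to sketch the sequential approach in Python is with a single helper that takes the "centre" function as an argument, in the spirit of smoothed value = centre(original value and adjacent values). The helper name `smooth`, its parameters and the data are assumptions for illustration. In this version the unsmoothable ends are dropped rather than kept as placeholders, so each pass shortens the series.

```python
from statistics import mean, median


def smooth(values, run_length, centre):
    """Apply `centre` (e.g. mean or median) to each odd-length window.

    Returns only the positions that have a full window, so the result
    is shorter than the input by run_length - 1 values.
    """
    half = run_length // 2
    return [
        centre(values[i - half : i + half + 1])
        for i in range(half, len(values) - half)
    ]


# Made-up series with a single outlier (40).
series = [5, 6, 40, 7, 8, 9, 10]

# Stage 1: running medians remove the outlier.
medians = smooth(series, 3, median)
# Stage 2: moving averages smooth the remaining fluctuations.
result = smooth(medians, 3, mean)

print(medians)
print(result)
```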
Smoothing up to the end of the series
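We are usually most interested in the latest values in a time series, but moving averages cannot provide smoothed values at the two ends of the time series. Exponential smoothing works up to the end of the series. Writing $y_i$ for the actual value and $s_i$ for the smoothed value at time i,
$s_i = a\,y_i + (1 - a)\,s_{i-1}$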
where the smoothing constant, a, is a value between 0 and 1. The smoothed value is a 'weighted average' of the actual value at that time and the previous smoothed value.
Alternative formula
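The formula can also be expressed as a weighted sum of the current value $y_i$ and all earlier values, giving the smoothed value $s_i$ as
$s_i = a\,y_i + a(1-a)\,y_{i-1} + a(1-a)^2\,y_{i-2} + \cdots$
For example, if a = 1/2,
$s_i = \tfrac{1}{2}\,y_i + \tfrac{1}{4}\,y_{i-1} + \tfrac{1}{8}\,y_{i-2} + \cdots$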
The smoothed value puts more weight on the recent past (which is an intuitively sensible thing to do).
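A minimal Python sketch of exponential smoothing follows. It assumes the recursive form in which each smoothed value is a times the actual value plus (1 − a) times the previous smoothed value, and it assumes the first smoothed value is simply set equal to the first actual value; the text does not say how the recursion is started, so that choice, the function name `exponential_smooth` and the data are illustrative only.

```python
def exponential_smooth(values, a=0.5):
    """Exponentially smoothed series.

    Assumes smoothed = a * value + (1 - a) * previous smoothed value,
    started (by assumption) at the first actual value.
    """
    if not values:
        return []
    if not 0 < a <= 1:
        raise ValueError("smoothing constant must satisfy 0 < a <= 1")
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(a * y + (1 - a) * smoothed[-1])
    return smoothed


# Made-up illustrative series, not data from the text.
series = [10, 12, 8, 14]
smoothed = exponential_smooth(series, a=0.5)

# The last smoothed value written as a weighted sum of all the values:
# weights 1/2, 1/4, 1/8 on the three most recent values, with the
# remaining weight (1/8) on the starting value.
expanded = 0.5 * 14 + 0.25 * 8 + 0.125 * 12 + 0.125 * 10

print(smoothed)
print(expanded)
```

The two calculations agree, and a smoothed value exists for every time point, including the most recent one.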
Forecasting future values
If the most recent value is at time i, we forecast the value at time i + k to be the last exponentially smoothed value, $s_i$:
$\hat{y}_{i+k} = s_i$
Time series with trend
If the time series has an increasing trend, the exponentially smoothed values lag behind the series and tend to be too low. Similarly, the smoothed values will be too high if there is a decreasing trend.
Do not use exponential smoothing on a time series with trend.
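The lag is easy to see on an artificial series that rises by a constant amount each period. The Python sketch below assumes the same recursion as before (a times the actual value plus (1 − a) times the previous smoothed value, started at the first actual value); the function name and the series are made up for illustration.

```python
def exponential_smooth(values, a=0.5):
    """Exponential smoothing, started (by assumption) at the first value."""
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(a * y + (1 - a) * smoothed[-1])
    return smoothed


# Artificial series with a steady increasing trend of 2 per period.
series = [10 + 2 * i for i in range(8)]
smoothed = exponential_smooth(series, a=0.5)

# Shortfall of the smoothed value below the actual value at each time.
shortfall = [y - s for y, s in zip(series, smoothed)]
print(shortfall)
```

After the first time point every smoothed value is below the actual value, and the shortfall grows towards a constant gap rather than disappearing. A forecast equal to the last smoothed value would therefore start too low and then stay flat while the series keeps rising.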
Least squares smoothing of adjacent values
Another method that provides smoothed values up to both ends of a time series is called lowess (locally weighted scatterplot smoothing). When used with time series, it is similar to running means except that instead of using the average of values at adjacent times, it fits a least squares line through them and uses this least squares line to estimate the smoothed value.
The number of adjacent points used for each smoothed value can be adjusted. As this 'window' becomes wider, the values are smoothed more, but if it is too wide, detail is lost.
Since a separate least squares line must be fitted to obtain each smoothed value, a computer must be used to apply this method.
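The simplified method described above can be sketched in Python as follows. This is not full lowess: it uses an unweighted least squares line in each window, and the names `fit_line`, `local_line_smooth` and `window` are assumptions for illustration. Near the ends of the series the window is shifted so that it still contains `window` points, which is what allows smoothed values at both ends.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for the points (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def local_line_smooth(values, window=5):
    """Smooth each value with a least squares line through nearby points.

    Times are taken to be 0, 1, 2, ... (an assumption of this sketch).
    """
    n = len(values)
    if not 2 <= window <= n:
        raise ValueError("window must be between 2 and the series length")
    smoothed = []
    for i in range(n):
        # Centre the window on i where possible; shift it at the ends.
        start = min(max(i - window // 2, 0), n - window)
        xs = list(range(start, start + window))
        intercept, slope = fit_line(xs, values[start : start + window])
        smoothed.append(intercept + slope * i)
    return smoothed


# Made-up illustrative series, not data from the text.
series = [3, 5, 4, 8, 7, 9, 13, 11, 14, 16]
print(local_line_smooth(series, window=5))
```

Widening `window` smooths more heavily; setting it to the full length of the series reduces the method to a single straight line.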
Local smoothing of scatterplots
Lowess can be used to smooth time series but was originally developed as a general way to draw a smooth curve on any type of scatterplot. Again, the smoothed value for any observation is obtained by fitting a least squares line to the observations with adjacent values for the explanatory variable. The fitted values that are obtained in this way are joined with lines.
(Most computer software implements a version of lowess that is actually a bit more complex than has been described here, but our simpler version gives a good flavour of the method.)
Least squares
Moving averages provide a good description of the trend in a time series but are less useful for forecasting future values. For forecasting, it is better to describe trend with a mathematical equation,
trend = function ( time )
The simplest such model is a linear model,
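$\text{trend} = b_0 + b_1\,\text{time}$
$b_0$ and $b_1$ can be estimated by least squares, choosing the values that minimise the sum of squared differences between the actual values $y_i$ and the trend,
$\sum_i \left(y_i - b_0 - b_1\,\text{time}_i\right)^2$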
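The least squares estimates for a linear trend have a simple closed form, sketched below in Python. The quantity being minimised is the sum of squared differences between the actual values and the fitted trend. The function names and the data are assumptions made up for illustration.

```python
def fit_linear_trend(times, values):
    """Least squares estimates (b0, b1) for trend = b0 + b1 * time."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    b1 = sxy / sxx
    b0 = y_bar - b1 * t_bar
    return b0, b1


def residual_sum_of_squares(times, values, b0, b1):
    """Sum of squared differences between values and the linear trend."""
    return sum((y - (b0 + b1 * t)) ** 2 for t, y in zip(times, values))


# Made-up illustrative data, not from the text.
times = [1, 2, 3, 4, 5, 6]
values = [3.1, 4.8, 7.2, 8.9, 11.3, 12.6]

b0, b1 = fit_linear_trend(times, values)
best = residual_sum_of_squares(times, values, b0, b1)

# Any other line gives a larger sum of squares.
nudged = residual_sum_of_squares(times, values, b0 + 0.1, b1 - 0.05)
print(b0, b1, best, nudged)
```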
Recoding the years
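When time is measured in calendar years, the intercept $b_0$ is the fitted trend at "year 0", far outside the range of the data, and is often very large in magnitude. This is avoided if the years are recoded so that some year within the range of the data becomes "year 0".
$\text{trend} = b_0 + b_1\,(\text{time} - 1960)$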
This model is equivalent and gives the same fitted values and forecasts.
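The equivalence can be checked numerically with a short Python sketch. The function name and the data are made up for illustration; 1960 is used as the recoded origin to match the equation above.

```python
def fit_linear_trend(times, values):
    """Least squares estimates (b0, b1) for trend = b0 + b1 * time."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    b1 = sxy / sxx
    return y_bar - b1 * t_bar, b1


# Made-up illustrative data, not from the text.
years = [1955, 1958, 1961, 1964, 1967]
values = [20.1, 22.4, 23.9, 27.2, 28.8]

# Model using the raw years.
raw_b0, raw_b1 = fit_linear_trend(years, values)

# Model using years recoded so that 1960 becomes "year 0".
recoded = [year - 1960 for year in years]
rec_b0, rec_b1 = fit_linear_trend(recoded, values)

# Forecasts for 1970 from each model.
raw_forecast = raw_b0 + raw_b1 * 1970
rec_forecast = rec_b0 + rec_b1 * (1970 - 1960)

print(raw_b0, raw_b1)
print(rec_b0, rec_b1)
print(raw_forecast, rec_forecast)
```

The slope and the forecasts agree; only the intercept changes, from the fitted trend at year 0 to the fitted trend in 1960.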
Quadratic models
If the trend in a time series is nonlinear, a linear model should not be used. A simple model that can explain some simple types of curvature is a quadratic model:
$\text{trend} = b_0 + b_1\,\text{time} + b_2\,\text{time}^2$
This has three parameters that can be adjusted to improve the fit of the model. Residuals are again defined as
$e_i = y_i - \text{trend}_i$
and the least squares estimates of $b_0$, $b_1$ and $b_2$ are the values that minimise the residual sum of squares,
$\sum e_i^2$
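A quadratic trend can be fitted by solving the three normal equations of least squares. The Python sketch below does this with plain Gaussian elimination so that it needs no external libraries; in practice a statistics package would normally be used. The function names and the data are assumptions for illustration, and the times are assumed to be recoded to small numbers to keep the arithmetic well behaved.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_quadratic_trend(times, values):
    """Least squares [b0, b1, b2] for trend = b0 + b1*time + b2*time^2."""
    rows = [[1.0, t, t * t] for t in times]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(3)] for j in range(3)]
    xty = [sum(r[j] * y for r, y in zip(rows, values)) for j in range(3)]
    return solve(xtx, xty)


# Made-up illustrative data with recoded times, not from the text.
times = [-3, -2, -1, 0, 1, 2, 3]
values = [14.2, 8.9, 5.1, 3.8, 5.0, 9.3, 14.1]

b0, b1, b2 = fit_quadratic_trend(times, values)
residuals = [y - (b0 + b1 * t + b2 * t * t) for t, y in zip(times, values)]
print(b0, b1, b2)
print(sum(e * e for e in residuals))
```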
Using linear and quadratic models for forecasting
After fitting a linear or quadratic model by least squares, forecasting is simply a matter of inserting future time values into its equation.
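In code, a forecast is just the fitted equation evaluated at a future time. The Python sketch below works for a linear or quadratic model; the function name `forecast` and the coefficient values are hypothetical, chosen only to show the arithmetic.

```python
def forecast(coefficients, time):
    """Evaluate trend = b0 + b1*time + b2*time^2 + ... at a given time.

    `coefficients` is [b0, b1] for a linear model or [b0, b1, b2] for
    a quadratic model.
    """
    return sum(b * time**power for power, b in enumerate(coefficients))


# Hypothetical coefficients for models fitted with recoded time
# (time = year - 1960); they do not come from the text.
linear_model = [23.7, 0.74]
quadratic_model = [23.1, 0.74, 0.02]

# Forecasts for 1970, i.e. recoded time 10.
print(forecast(linear_model, 10))
print(forecast(quadratic_model, 10))
```

The future time must be expressed on the same scale as the times used to fit the model, for example as years since 1960 if the years were recoded.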
Dangers in forecasting
It is important to realise that the forecasts from linear or quadratic models are highly dependent on the type of line or curve that is chosen for modelling. The dangers are the same as those for extrapolation in bivariate relationships.
Beware forecasting many time periods into the future — the shape of the actual trend line might be different from your model.
Cubic and higher-degree polynomial models
If a quadratic model does not adequately describe the shape of the trend in a time series, it is tempting to try to further increase the order of the polynomial,
$\text{trend} = b_0 + b_1\,\text{time} + b_2\,\text{time}^2 + b_3\,\text{time}^3 + \cdots$
This kind of polynomial model can also be fitted by least squares.
A polynomial of degree 3 or 4 often provides a fairly smooth description of trend but polynomial models usually behave badly (with sudden increases or decreases) beyond the data points, so...
Polynomial models of degree greater than 2 should not be used for forecasting.
Residuals
The residuals for a time series model are obtained by subtracting the trend from the values, and are called the detrended values,
$e_i = y_i - \text{trend}_i$
If the model under consideration fits well, there should be no pattern in the residuals — each should have the same chance of being positive or negative.
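A small Python sketch shows what a pattern in the residuals looks like when the model is wrong. Here a straight line is fitted to an artificial series that is actually curved; the function name and the series are made up for illustration.

```python
def fit_linear_trend(times, values):
    """Least squares estimates (b0, b1) for trend = b0 + b1 * time."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    b1 = sxy / sxx
    return y_bar - b1 * t_bar, b1


# Artificial curved series (values equal to time squared).
times = [0, 1, 2, 3, 4]
values = [0, 1, 4, 9, 16]

b0, b1 = fit_linear_trend(times, values)

# Detrended values: actual value minus fitted trend.
residuals = [y - (b0 + b1 * t) for t, y in zip(times, values)]
signs = ["+" if e > 0 else "-" for e in residuals]

print(residuals)
print(signs)
```

The residuals are positive at both ends and negative in the middle. That systematic run of signs indicates that a linear model does not describe the trend well and that a curved model should be considered.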