Forms summaries for a Markov model from rainfall data (J.O. Ong’ala & D.B. Baird).
Options
PRINT = string tokens | Controls printed output (counts, amounts, probabilities); default *
PLOT = string token | What plots to display (probabilities); default prob
DAY = variate or factor | Day as a date or a day number within the year
LIMITS = scalar or variate | Values to define the daily rainfall states; default 0.85
ORDER = scalar | Defines the order of the Markov chain (0…5); default 1
HIGHORDER = scalar | Whether to use a high-order Markov chain (no, yes); default no
INITIAL = scalar or variate | The amounts of rainfall prior to the first day; default *
SPREADSHEET = string tokens | What to save in a spreadsheet (counts, amounts, probabilities); default *
Parameters
DATA = variates | The daily rainfall amounts
WINDOW = scalars | Window to plot the graph; default 3 for ORDER=0 and 1 otherwise
TITLE = texts | The title for the plot; default uses an automatic description
COUNTS = tables | Saves the counts by Markov state and day
AMOUNTS = tables | Saves the mean rainfall by Markov wet states and day
PROBABILITIES = pointers | Saves a pointer to variates of probabilities of a wet day by class
CATEGORIES = factors | Saves the Markov class for each day
STATECOUNTS = pointers | Saves a pointer to tables of counts for each state
OUTFILE = texts | File (with extension .gwb or .xlsx) to save selected spreadsheet components
Description
RFSUMMARY creates summaries from rainfall data for a Markov chain model analysis. The Markov model splits the days into different classes based on the history of the preceding days. This is to allow for different probabilities and amounts of rainfall on a day according to what happened previously: for example, in most climates, it is more likely to rain on a day following previous rain.
The daily states, order and type of Markov model are specified by the LIMITS, ORDER and HIGHORDER options, respectively. If the LIMITS option is set to a scalar or variate of length one, this defines the breakpoint between dry and wet days. A small positive value treats days with less than this amount of rainfall as dry days (these are also removed from the rainfall for wet days). If LIMITS is set to a variate of length two or more, the rainfall states are defined as the days with rainfall less than or equal to these limits, with an extra group for rainfall greater than the top limit. The ORDER option specifies the number of previous days to use when forming the Markov classes.
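The multi-limit rule can be illustrated with a short Python sketch. This is not the procedure's own code: the function name rainfall_state and the example limits are hypothetical, and the sketch assumes the "less than or equal to" boundary described for a variate of limits (the single-breakpoint case is described above with "less than"). Missing values are not handled.

```python
from bisect import bisect_left

def rainfall_state(amount, limits):
    """Return the 0-based state index for one day's rainfall.

    Assumption: state i holds amounts <= limits[i] (and above limits[i-1]);
    amounts above the top limit fall in the extra state len(limits).
    """
    return bisect_left(sorted(limits), amount)

limits = [0.85, 10.0]                  # hypothetical limits giving three states
rain = [0.0, 0.85, 3.2, 10.0, 25.4]    # made-up daily amounts
states = [rainfall_state(r, limits) for r in rain]
```

With two limits there are three states, labelled here 0, 1 and 2, matching the integer labelling used when there are more than two rainfall states.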
The classes are the combination of the daily states over the history length defined by ORDER. (So there will be (NVALUES(LIMITS)+1)**(ORDER+1) classes.) If there are two rainfall states, these are labelled w and d for wet and dry on each day. Otherwise they are labelled by the integers from 0 upwards. When there are two states, the default HIGHORDER=no gives all the unique combinations of wet and dry days over these days. Setting HIGHORDER=yes collapses the states to just the number of dry days preceding a wet day. For example, with ORDER=2 and HIGHORDER=no, the 8 states are ddd, ddw, dwd, dww, wdd, wdw, wwd and www (where d = dry day and w = wet day); with ORDER=2 and HIGHORDER=yes, the 6 states are ddd, ddw, dw, wdd, wdw and ww, as dwd and dww are combined into dw, and wwd and www are combined into ww. ORDER must be between 0 and 3 for HIGHORDER=no and between 2 and 5 for HIGHORDER=yes.
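The class count and the high-order collapsing can be checked with a small Python sketch. The helper names are hypothetical, and the collapsing rule is an assumption inferred from the merges stated above (dwd and dww into dw, wwd and www into ww): keep the first symbol and cut the remaining history just after its first wet day. It is not taken from the procedure's source.

```python
from itertools import product

def class_labels(n_states, order):
    """All classes for HIGHORDER=no: one symbol per day over ORDER+1 days."""
    symbols = "dw" if n_states == 2 else [str(i) for i in range(n_states)]
    return ["".join(p) for p in product(symbols, repeat=order + 1)]

def collapse_high_order(label):
    """Assumed HIGHORDER=yes rule for two states: keep the first symbol and
    cut the remaining history just after its first 'w'."""
    head, history = label[0], label[1:]
    cut = history.find("w")
    return head + (history if cut < 0 else history[:cut + 1])

full = class_labels(2, 2)                       # 2**(2+1) = 8 classes
collapsed = sorted(set(collapse_high_order(c) for c in full))
```

For ORDER=2 this gives 8 classes before collapsing and 6 afterwards, in agreement with the counts quoted in the example.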
The DAY option gives the dates or the day number within a year (1…366), and the DATA parameter gives the amount of rainfall on these dates. The data should be sorted into chronological order with no missing days. (Missing values should be entered for any days with no observations.) The INITIAL option can specify the amount of rain on the days preceding the first day in DATA; this should have ORDER values. If INITIAL is not set, the first ORDER days will not contribute to the counts and amounts.
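The effect of INITIAL on the first ORDER days can be sketched in Python as follows. The function markov_classes is hypothetical, and two things are assumed rather than documented: that a class label lists the current day's state first, followed by the previous days from most recent to oldest, and that the initial values are given in chronological order. Missing values are not handled.

```python
def markov_classes(states, order, initial=None):
    """Label each day by its own state followed by the states of the
    previous `order` days, most recent first (assumed label order).
    Days without a full history get None."""
    history = list(initial) if initial is not None else []
    padded = history + list(states)      # initial values are oldest first
    offset = len(history)
    labels = []
    for i in range(len(states)):
        j = i + offset
        if j < order:
            labels.append(None)          # not enough preceding days
        else:
            window = padded[j - order:j + 1]   # oldest ... current day
            labels.append("".join(reversed(window)))
    return labels

states = list("dwwdd")                   # made-up sequence of dry/wet days
without_initial = markov_classes(states, order=1)
with_initial = markov_classes(states, order=1, initial=["w"])
```

Without initial values the first day has no class and so cannot be counted; supplying one preceding state lets it be classified like any other day.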
You can save the summaries with the COUNTS, AMOUNTS, PROBABILITIES, CATEGORIES and STATECOUNTS parameters:
COUNTS saves a table of counts classified by day number within the year (1…366) and Markov class (e.g. dd, wd, dw and ww);
AMOUNTS saves a table of the sum of rainfall amounts classified by day and Markov wet classes (e.g. wd and ww);
PROBABILITIES saves a pointer to a set of variates for each wet class giving probability of a wet day vs. a dry day for the days;
CATEGORIES saves a factor giving the Markov class for each date; and
STATECOUNTS saves a pointer to tables for each state defined by LIMITS, giving the counts by Markov class and day.
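As a rough illustration of how counts by day and class relate to the probabilities, here is a Python sketch. The functions and data are hypothetical, and the probability definition is an assumption: for a given day and history, the count of the wet class divided by the combined count of the wet and dry classes sharing that history.

```python
from collections import Counter

def tabulate(days, labels):
    """Count occurrences of each (day number, Markov class) pair."""
    return Counter((d, c) for d, c in zip(days, labels) if c is not None)

def wet_probability(counts, day, history):
    """Assumed definition:
    P(wet | history) = n(w+history) / (n(w+history) + n(d+history))."""
    wet = counts[(day, "w" + history)]
    dry = counts[(day, "d" + history)]
    total = wet + dry
    return wet / total if total else None

# Hypothetical data: day 1 of the year observed in four different years
days = [1, 1, 1, 1]
labels = ["wd", "dd", "dd", "ww"]
counts = tabulate(days, labels)
p_after_dry = wet_probability(counts, 1, "d")   # one wet day out of three
```

A day with no observations for a given history returns None here rather than a probability.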
Printed output is controlled by the PRINT option, with settings:
counts: counts by day and Markov class;
amounts: amounts by day and wet Markov class; and
probabilities: probabilities by day and wet Markov class.
The summaries can be displayed in a spreadsheet by setting the SPREADSHEET option to the following settings:
counts: creates a sheet containing the counts for each day by the Markov classes;
amounts: shows the amounts of rainfall in the wet classes; and
probabilities: shows the probability of rainfall in the wet classes.
The spreadsheet can be saved to a file by setting the OUTFILE parameter to a Genstat or Excel spreadsheet filename (.gwb or .xlsx).
You can set option PLOT=probabilities to plot the probabilities. The TITLE parameter can supply a title for the graph; if this is not set, a descriptive title will be created from the Markov chain options. The WINDOW parameter specifies the window to use for the graph.
Options: PRINT, PLOT, DAY, LIMITS, ORDER, HIGHORDER, INITIAL, SPREADSHEET.
Parameters: DATA, WINDOW, TITLE, COUNTS, AMOUNTS, PROBABILITIES, CATEGORIES, STATECOUNTS, OUTFILE.
Method
The procedure calculates the class of each day, and then tabulates these to create summaries. If dates are provided in DAY, these are converted to days in the year by the NDAYINYEAR function. Note: 29 February (which is only present in leap years) is day 60, and 1 March is always day 61.
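The day-numbering convention in the note can be reproduced with a short Python sketch. This is only an illustration of the convention on a fixed 366-day scale, not the NDAYINYEAR function itself; the name day_in_year is hypothetical.

```python
from datetime import date
import calendar

def day_in_year(d):
    """Day number on a fixed 366-day scale: 29 February is day 60 and
    1 March is day 61 in every year."""
    n = d.timetuple().tm_yday
    if not calendar.isleap(d.year) and d.month >= 3:
        n += 1          # skip the slot reserved for 29 February
    return n

leap_day = day_in_year(date(2000, 2, 29))
march_first_leap = day_in_year(date(2000, 3, 1))
march_first_common = day_in_year(date(2001, 3, 1))
```

Under this convention day 60 is simply never observed in non-leap years, so every calendar date keeps the same day number from year to year.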
Action with RESTRICT
The DATA or DAY variates can be restricted to analyse a subset of the data. If both DATA and DAY are restricted, the restrictions must be consistent.
Reference
Ong’ala, J.O. (2011). Simplifying the Markov chain analysis of rainfall data using Genstat. MSc Thesis, Maseno University.
See also
Procedures: RFFAMOUNT, RFFPROBABILITY.
Commands for: Basic and nonparametric statistics.
Example
CAPTION 'RFSUMMARY example','41 years rainfall for Katumani, Kenya'; \
        STYLE=meta,minor
IMPORT [PRINT=summary] '%Data%/Rainfall Katumani 1961-2001.gsh'
RFSUMMARY [PRINT=counts,amounts,probabilities; PLOT=probabilities; \
          DAY=Date; ORDER=1; SPREADSHEET=counts,amounts,probabilities] Rainfall; \
          COUNTS=RFCounts; AMOUNTS=RFAmounts; TITLE='Katumani rainfall 1961-2001'
RFFPROBAB [PLOT=results] COUNTS=RFCounts; \
          TITLE='Katumani rainfall probabilities 1961-2001'
RFFAMOUNT [PLOT=results] COUNTS=RFCounts; \
          AMOUNTS=RFAmounts; TITLE='Katumani rainfall amounts 1961-2001'
RFSUMMARY [PRINT=*; PLOT=*; DAY=Date; ORDER=3; HIGHORDER=yes] Rainfall; \
          COUNTS=RFCounts; AMOUNTS=RFAmounts
RFFPROBAB [PLOT=results; SPREADSHEET=] COUNTS=RFCounts; \
          TITLE='Katumani rainfall probabilities 1961-2001 (high order 3)'
RFFAMOUNT [PLOT=results; SPREADSHEET=] COUNTS=RFCounts; AMOUNTS=RFAmounts; \
          TITLE='Katumani rainfall amounts 1961-2001 (high order 3)'