RFSUMMARY procedure

Forms summaries for a Markov model from rainfall data (J.O. Ong’ala & D.B. Baird).

Options

`PRINT` = string tokens	Controls printed output (`counts`, `amounts`, `probabilities`); default `*`
`PLOT` = string token	What plots to display (`probabilities`); default `prob`
`DAY` = variate or factor	Day as a date or a day number within the year
`LIMITS` = scalar or variate	Values to define the daily rainfall states; default 0.85
`ORDER` = scalar	Defines the order of the Markov chain (0…5); default 1
`HIGHORDER` = string token	Whether to use a high-order Markov chain (`no`, `yes`); default `no`
`INITIAL` = scalar or variate	The amounts of rainfall prior to the first day; default `*`
`SPREADSHEET` = string tokens	What to save in a spreadsheet (`counts`, `amounts`, `probabilities`); default `*`

Parameters

`DATA` = variates	The daily rainfall amounts
`WINDOW` = scalars	Window to plot the graph; default 3 for `ORDER`=0 and 1 otherwise
`TITLE` = texts	The title for the plot; default uses an automatic description
`COUNTS` = tables	Saves the counts by Markov state and day
`AMOUNTS` = tables	Saves the mean rainfall by Markov wet states and day
`PROBABILITIES` = pointers	Saves a pointer to variates of probabilities of a wet day by class
`CATEGORIES` = factors	Saves the Markov class for each day
`STATECOUNTS` = pointers	Saves a pointer to tables of counts for each state
`OUTFILE` = texts	File (with extension `.gwb` or `.xlsx`) to save selected spreadsheet components

Description

RFSUMMARY creates summaries from rainfall data for a Markov chain model analysis. The Markov model splits the days into different classes based on the history of the preceding days. This is to allow for different probabilities and amounts of rainfall on a day according to what happened previously: for example, in most climates, it is more likely to rain on a day following previous rain.

The daily states, order and type of Markov model are specified by the LIMITS, ORDER and HIGHORDER options, respectively. If the LIMITS option is set to a scalar or a variate of length one, this defines the breakpoint between dry and wet days. A small positive value treats days with less than this amount of rainfall as dry days (these are also removed from the rainfall for wet days). If LIMITS is set to a variate of length two or more, the rainfall states are defined as the days with rainfall less than or equal to each of these limits in turn, with an extra group for rainfall greater than the top limit. The ORDER option specifies the number of previous days to use when forming the Markov classes.
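The way a variate of two or more limits divides rainfall into states can be sketched as follows. This is a Python illustration of the rule described above, not Genstat code and not the procedure's own implementation; the function name `rainfall_state` and the example limits are hypothetical.

```python
from bisect import bisect_left

def rainfall_state(amount, limits):
    """Return the state index (0 upwards) for one day's rainfall.

    Assumes `limits` is sorted in ascending order and holds two or more
    values. State k holds amounts less than or equal to limits[k] (and
    above limits[k-1]); the final state holds amounts above the top limit.
    """
    # bisect_left counts the limits strictly below `amount`, so an amount
    # exactly equal to a limit stays in that limit's own group.
    return bisect_left(limits, amount)

limits = [0.85, 10.0]                      # hypothetical limits
amounts = [0.0, 0.85, 3.2, 10.0, 25.4]     # made-up daily rainfall
states = [rainfall_state(x, limits) for x in amounts]
print(states)  # [0, 0, 1, 1, 2]
```

With two limits there are three states, labelled 0, 1 and 2, matching the "extra group" for rainfall above the top limit.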

The classes are the combination of the daily states over the history length defined by ORDER. (So there will be (NVALUES(LIMITS)+1)**(ORDER+1) classes.) If there are two rainfall states, these are labelled w and d for wet and dry on each day. Otherwise they are labelled by the integers from 0 upwards. When there are two states, the default HIGHORDER=no gives all the unique combinations of wet and dry days over these days. Setting HIGHORDER=yes collapses the states to just the number of dry days preceding a wet day. For example, with ORDER=2 and HIGHORDER=no, the 8 states are ddd, ddw, dwd, dww, wdd, wdw, wwd and www (where d = dry day and w = wet day); with ORDER=2 and HIGHORDER=yes, the 6 states are ddd, ddw, dw, wd, wdd, and ww, as dwd and dww are combined into dw and wwd and www are combined into ww. ORDER must be between 0 and 3 for HIGHORDER=no and between 2 and 5 for HIGHORDER=yes.
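The class count for HIGHORDER=no can be checked by enumerating the combinations directly. The sketch below is a Python illustration only (it is not Genstat code); `markov_classes` is a hypothetical helper, and it does not attempt the HIGHORDER=yes collapsing.

```python
from itertools import product

def markov_classes(n_limits, order):
    """List every class label for the full (HIGHORDER=no) chain.

    n_limits plays the role of NVALUES(LIMITS); each class is a
    combination of daily states over order + 1 days.
    """
    n_states = n_limits + 1
    # Two states are labelled d and w; otherwise integers from 0 upwards.
    labels = "dw" if n_states == 2 else [str(i) for i in range(n_states)]
    return ["".join(combo) for combo in product(labels, repeat=order + 1)]

classes = markov_classes(n_limits=1, order=2)
print(len(classes))  # 8, i.e. (1 + 1) ** (2 + 1)
print(classes)       # ['ddd', 'ddw', 'dwd', 'dww', 'wdd', 'wdw', 'wwd', 'www']
```

The same function gives 9 classes for two limits with ORDER=1, in line with (NVALUES(LIMITS)+1)**(ORDER+1).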

The DAY option gives the dates or the day number within a year (1…366), and the DATA parameter gives the amount of rainfall on these dates. The data should be sorted into chronological order with no missing days. (Missing values should be entered for any days with no observations.) The INITIAL option can specify the amount of rain on the days preceding the first day in DATA; this should have ORDER values. If INITIAL is not set, the first ORDER days will not contribute to the counts and amounts.
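The effect of INITIAL on the first ORDER days can be pictured with a small Python sketch (an illustration, not Genstat code). It makes two assumptions that the text above does not state: the inputs are already converted to d/w states rather than rainfall amounts, and each label is written with the oldest day first and the current day last. The name `daily_classes` is hypothetical.

```python
def daily_classes(states, order, initial=None):
    """Return a class label for each day, or None where history is incomplete.

    states  : list of 'd'/'w' states in chronological order
    initial : optional list of `order` states for the days before the first
    """
    history = list(initial) if initial is not None else []
    if initial is not None and len(history) != order:
        raise ValueError("initial must supply exactly `order` values")
    padded = history + list(states)
    offset = len(history)
    result = []
    for i in range(len(states)):
        j = i + offset
        if j < order:
            # Not enough preceding days: this day cannot be classified.
            result.append(None)
        else:
            result.append("".join(padded[j - order:j + 1]))
    return result

without_initial = daily_classes(list("dwwd"), order=1)
with_initial = daily_classes(list("dwwd"), order=1, initial=["w"])
print(without_initial)  # [None, 'dw', 'ww', 'wd']
print(with_initial)     # ['wd', 'dw', 'ww', 'wd']
```

Without initial values the first day has no class and so cannot contribute to the counts and amounts; supplying one prior state classifies every day.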

You can save the summaries with the COUNTS, AMOUNTS, PROBABILITIES, CATEGORIES and STATECOUNTS parameters:

COUNTS saves a table of counts classified by day number within the year (1…366) and Markov class (e.g. dd, wd, dw and ww);
AMOUNTS saves a table of the sum of rainfall amounts classified by day and Markov wet classes (e.g. wd and ww);
PROBABILITIES saves a pointer to a set of variates, one for each wet class, giving the probability of a wet day rather than a dry day on each day of the year;
CATEGORIES saves a factor giving the Markov class for each date; and
STATECOUNTS saves a pointer to tables for each state defined by LIMITS, giving the counts by Markov class and day.
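The tabulation behind COUNTS, and the kind of proportion that a probability of a wet day represents, can be sketched in Python (an illustration, not Genstat code). The helper names are hypothetical, the data are made up, and the raw proportion shown is an assumed interpretation rather than a statement of how the procedure calculates its probabilities. Labels are assumed to end with the current day's state.

```python
from collections import Counter

def tabulate_counts(days, classes):
    """Count each (day number, class) pair, skipping unclassified days."""
    return Counter((d, c) for d, c in zip(days, classes) if c is not None)

def wet_proportion(counts, day, history):
    """Raw proportion of wet days on `day` among days with this prior history.

    Assumption: a class label is the prior history followed by the
    current day's state (w or d).
    """
    wet = counts[(day, history + "w")]
    dry = counts[(day, history + "d")]
    total = wet + dry
    return wet / total if total else None

days = [1, 1, 1, 1, 2, 2]                        # made-up day numbers
classes = ["dw", "dd", "dw", "ww", "wd", None]   # made-up Markov classes
counts = tabulate_counts(days, classes)
print(counts[(1, "dw")])               # 2
print(wet_proportion(counts, 1, "d"))  # two wet days out of three after a dry day
```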

Printed output is controlled by the PRINT option, with settings:

counts	counts by day and Markov class;
amounts	amounts by day and wet Markov class; and
probabilities	probabilities by day and wet Markov class.

The summaries can be displayed in a spreadsheet by setting the SPREADSHEET option to the following settings:

counts creates a sheet containing the counts for each day by the Markov classes;
amounts shows the amounts of rainfall in the wet classes; and
probabilities shows the probability of rainfall in the wet classes.

The spreadsheet can be saved to a file by setting the OUTFILE parameter to a Genstat or Excel spreadsheet filename (.gwb or .xlsx).

You can set option PLOT=probabilities to plot the probabilities. The TITLE parameter can supply a title for the graph; if this is not set, a descriptive title will be created from the Markov chain options. The WINDOW parameter specifies the window to use for the graph.

Options: PRINT, PLOT, DAY, LIMITS, ORDER, HIGHORDER, INITIAL, SPREADSHEET.
Parameters: DATA, WINDOW, TITLE, COUNTS, AMOUNTS, PROBABILITIES, CATEGORIES, STATECOUNTS, OUTFILE.

Method

The procedure calculates the class of each day, and then tabulates these to create summaries. If dates are provided in DAY, these are converted to days in the year by the NDAYINYEAR function. Note: 29 February (which is present only in leap years) is day 60, and 1 March is always day 61, so day 60 has no observations in non-leap years.
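The fixed 366-day numbering can be reproduced outside Genstat with the Python sketch below. It mimics the numbering described above and is not the NDAYINYEAR implementation; the function name `day_in_year_366` is hypothetical.

```python
from datetime import date

def day_in_year_366(d):
    """Day number on a fixed 366-day scale: 29 Feb is 60, 1 Mar is always 61."""
    # Map the month and day onto a leap year (2000 is one), so a date keeps
    # the same number whether or not its own year is a leap year.
    return date(2000, d.month, d.day).timetuple().tm_yday

print(day_in_year_366(date(1964, 2, 29)))   # 60
print(day_in_year_366(date(1965, 3, 1)))    # 61 (non-leap year)
print(day_in_year_366(date(1964, 3, 1)))    # 61 (leap year)
print(day_in_year_366(date(2001, 12, 31)))  # 366
```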

Action with `RESTRICT`

The DATA or DAY variates can be restricted to analyse a subset of the data. If both DATA and DAY are restricted, the restrictions must be consistent.

Reference

Ong’ala, J.O. (2011). Simplifying the Markov chain analysis of rainfall data using Genstat. MSc Thesis, Maseno University.

Example

CAPTION 'RFSUMMARY example','41 years rainfall for Katumani, Kenya'; \
   STYLE=meta,minor
   
IMPORT [PRINT=summary] '%Data%/Rainfall Katumani 1961-2001.gsh'

RFSUMMARY [PRINT=counts,amounts,probabilities; PLOT=probabilities; \
   DAY=Date; ORDER=1; SPREADSHEET=counts,amounts,probabilities] Rainfall; \
   COUNTS=RFCounts; AMOUNTS=RFAmounts; TITLE='Katumani rainfall 1961-2001'
RFFPROBAB [PLOT=results] COUNTS=RFCounts; \
   TITLE='Katumani rainfall probabilities 1961-2001'
RFFAMOUNT [PLOT=results] COUNTS=RFCounts; \
   AMOUNTS=RFAmounts; TITLE='Katumani rainfall amounts 1961-2001'
          
RFSUMMARY [PRINT=*; PLOT=*; DAY=Date; ORDER=3; HIGHORDER=yes] Rainfall; \
   COUNTS=RFCounts; AMOUNTS=RFAmounts
RFFPROBAB [PLOT=results; SPREADSHEET=] COUNTS=RFCounts; \
   TITLE='Katumani rainfall probabilities 1961-2001 (high order 3)'
RFFAMOUNT [PLOT=results; SPREADSHEET=] COUNTS=RFCounts; AMOUNTS=RFAmounts; \
   TITLE='Katumani rainfall amounts 1961-2001 (high order 3)'

Updated on February 10, 2022
