RJOINT procedure

Does modified joint regression analysis for variety-by-environment data (P.W. Lane & K. Ryder).

Options

`PRINT` = string tokens	What to print (`model`, `summary`, `estimates`, `monitoring`, `graph`); default `mode`, `summ`, `esti`
`TITLE` = text	Overall title for graph
`YTITLE` = text	Y-axis title for graph
`XTITLE` = text	X-axis title for graph
`TOLERANCE` = scalar	Convergence criterion; default 0.001
`MAXCYCLE` = scalar	Maximum number of cycles; default 15
`SAVE` = regression save structure	Save structure from `MODEL` statement defining the model; default is to use the structure from the latest `MODEL` statement

Parameters

`ENVIRONMENT` = factors	The environment factor; no default
`VARIETY` = factors	The variety factor; no default
`SENSITIVITIES` = variates	To store estimates of sensitivities; default `*`
`VARMEANS` = variates	To store estimates of variety means; default `*`
`ENVEFFECTS` = variates	To store estimates of environment effects; default `*`
`ENVMEANS` = variates	To store estimates of environment means; default `*`
`SESENSITIVITIES` = variates	To store s.e.s of sensitivities; default `*`
`SEVARMEANS` = variates	To store s.e.s of variety means; default `*`
`SEENVEFFECTS` = variates	To store s.e.s of environment effects; default `*`
`DEVIANCE` = scalar	To store the residual deviance
`DF` = scalar	To store the residual d.f.
`EXIT` = scalar	Exit status – set to 0 if the analysis converged, 1 otherwise

Description

Procedure RJOINT performs a modified joint regression analysis of data classified by two factors. This analysis is motivated by the study of variety-by-environment interactions in agriculture, where the two factors are varieties of some crop and environments at which experiments were carried out. The environments may be different sites within the same year, different years for the same site, or, as is more common, a combination of the two with little interest in individual year and site contributions. The intention is to characterize the sensitivity (or, inversely, the stability) of each variety to environmental effects by fitting a regression of the environment means for a variety on the average environment means. The model is thus nonlinear, of the form

y_ij = v_i + b_i × e_j + error

where v_i are variety means, e_j are environment effects (with ∑e_j=0) and b_i are the sensitivity parameters (with mean(b_i)=1). Usually, an experimenter is looking for varieties with large means and small sensitivities, to ensure a reliable crop under variable conditions. In RJOINT the factors are specified using the parameters VARIETY and ENVIRONMENT.

The data may consist of one value of the response, such as yield, for each combination of variety and environment. More often, the data are incomplete because not all varieties are tested at each environment; also, there may be multiple measurements of varieties at some environments. If the response is a count or a proportion, such as when investigating disease resistance, it will be more appropriate to use a generalized linear model based on a Poisson or binomial distribution and a log or logit link function. The model should be specified by giving a MODEL statement before calling RJOINT; for example,

MODEL yield

You can choose to fit any generalized linear model by setting the DISTRIBUTION and LINK options of MODEL: thus, to model proportions, you could give a statement like

MODEL [DISTRIBUTION=binomial; LINK=logit; DISPERSION=*]\
      prop; NBINOMIAL=100
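For count data, the same pattern applies with a Poisson distribution and log link. The following is a minimal sketch only: `count` is a hypothetical variate of counts (it is not part of the example data on this page), and `site` and `variety` are assumed to be factors of the same length.

```genstat
"Sketch: count is a hypothetical variate of counts"
MODEL  [DISTRIBUTION=poisson; LINK=logarithm] count
RJOINT ENVIRONMENT=site; VARIETY=variety
```

With a non-Normal distribution, the summary reports an analysis of deviance rather than an analysis of variance, and the estimates are on the scale of the link function.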

The iterative process used in the procedure is controlled by the options TOLERANCE and MAXCYCLE. At each iteration, the maximum difference between estimates of the sensitivity parameters in successive iterations is compared to the tolerance: the process ends when the differences are small enough, or when the maximum number of iterations is reached. The progress of the search can be followed by including the monitoring setting of the PRINT option, and the EXIT parameter can save a scalar with the value zero if the analysis converged and one otherwise.
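As an illustration of these controls, the sketch below tightens the tolerance, allows more cycles, monitors the iterations and then checks the exit status. The values 0.0001 and 50 and the identifier `status` are arbitrary choices for illustration, not recommendations.

```genstat
MODEL  yield
"Tighter convergence criterion, more cycles, and monitoring output"
RJOINT [PRINT=summary,estimates,monitoring; TOLERANCE=0.0001; MAXCYCLE=50]\
       ENVIRONMENT=site; VARIETY=variety; EXIT=status
"status is 0 if the analysis converged, 1 otherwise"
IF status.NE.0
  PRINT [IPRINT=*] 'RJOINT did not converge within the maximum number of cycles.'
ENDIF
```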

Output is controlled by the option PRINT. The setting model prints a description of the model. The setting summary displays an analysis of variance (or deviance for non-Normal distributions) showing the effects of Varieties, Environments and Sensitivities: the last is the effect of allowing different sensitivities for each variety. The setting estimates displays two tables, one classified by varieties and the other by environments. For varieties the columns are: unadjusted means, final estimates of means (on the scale of the link function, if relevant), standard errors of estimates, back-transformed means (using the inverse of the link function), sensitivities, and standard errors of sensitivities. For environments the columns are: estimates of effects (on the scale of the link function, if relevant), standard errors of estimates, means (formed from the effects and the mean of the variety means), and back-transformed means (using the inverse of the link function). Finally, the setting graph plots the model. The TITLE option defines the overall title for the graph; the default is Joint regression analysis. The YTITLE and XTITLE options define titles for the y- and x-axes, respectively; the default for the y-axis is the name of the y-variate, and the default for the x-axis is the name of the ENVIRONMENT factor.
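For instance, a call that requests only the graph and replaces the default titles might look like the sketch below; the title strings are made up for illustration.

```genstat
MODEL  yield
"Plot the fitted model with user-supplied titles"
RJOINT [PRINT=graph; TITLE='Sensitivity of varieties to site';\
       YTITLE='Yield'; XTITLE='Site effect']\
       ENVIRONMENT=site; VARIETY=variety
```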

The remaining parameters allow the following results to be saved: sensitivities, variety means, environment effects and environment means; standard errors of the sensitivities, variety means and environment effects; and the residual deviance and degrees of freedom. After calling the procedure, you can use the RKEEP directive to access fitted values and residuals. Other results from the fit that can be accessed via RKEEP or RDISPLAY may not be correct: for example, the number of residual d.f. shown by

RDISPLAY [PRINT=summary]

does not allow for the estimation of sensitivities.
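A minimal sketch of saving results follows. The identifiers `sens`, `sesens`, `vmean`, `eeff`, `dev`, `resdf`, `fit` and `res` are hypothetical names chosen for illustration. The residual deviance and d.f. are taken from the DEVIANCE and DF parameters of RJOINT rather than from RDISPLAY, for the reason given above.

```genstat
MODEL  yield
RJOINT [PRINT=summary] ENVIRONMENT=site; VARIETY=variety;\
       SENSITIVITIES=sens; SESENSITIVITIES=sesens; VARMEANS=vmean;\
       ENVEFFECTS=eeff; DEVIANCE=dev; DF=resdf
"Results classified by variety"
PRINT  sens,sesens,vmean
"Results classified by environment"
PRINT  eeff
"Residual deviance and d.f. as stored by RJOINT"
PRINT  dev,resdf
"Fitted values and residuals from the final fit"
RKEEP  FITTEDVALUES=fit; RESIDUALS=res
```

The variety and environment results are printed in separate statements because they are assumed to have different lengths (one value per variety and one per environment, respectively).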

Options: PRINT, TITLE, YTITLE, XTITLE, TOLERANCE, MAXCYCLE, SAVE.

Parameters: ENVIRONMENT, VARIETY, SENSITIVITIES, VARMEANS, ENVEFFECTS, ENVMEANS, SESENSITIVITIES, SEVARMEANS, SEENVEFFECTS, DEVIANCE, DF, EXIT.

Method

The procedure uses iterative scheme (A) referred to in Digby (1979). The scheme has been generalized to deal with alternative distributions and link functions. First the environment effects are estimated with the sensitivity parameters set to 1, and then the procedure alternates between estimating the sensitivities with given environment effects and estimating environment effects with given sensitivities. Convergence is tested by comparing the maximum difference between old and new sensitivities against the criterion (default 0.001), but the maximum number of cycles (default 15) will not be exceeded. If the MAXCYCLE option is set to 1, the result is an unmodified joint regression analysis; see Finlay & Wilkinson (1963).
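To compare the two analyses on the same data, one possible sketch is to run the procedure once with MAXCYCLE=1 and once with the default settings, saving the sensitivities each time. The identifiers `sens1` and `sensmod` are hypothetical.

```genstat
MODEL  yield
"Unmodified joint regression: a single cycle"
RJOINT [PRINT=estimates; MAXCYCLE=1] ENVIRONMENT=site; VARIETY=variety;\
       SENSITIVITIES=sens1
"Modified joint regression with the default TOLERANCE and MAXCYCLE"
RJOINT [PRINT=estimates] ENVIRONMENT=site; VARIETY=variety;\
       SENSITIVITIES=sensmod
"Side-by-side comparison of the two sets of sensitivities"
PRINT  sens1,sensmod
```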

Action with `RESTRICT`

A restriction applied to the response variate will be taken into account. Residuals and fitted values will be formed only for the restricted subset of values. If levels of the factors are not represented in the restricted subset, then no results will be shown for those varieties and/or environments. Do not restrict the environment or variety factor differently to the response variate: results may then be incorrect.
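As a sketch of the recommended usage, the restriction is placed on the response variate only, and removed afterwards. The condition used here (sites 1 to 10) is an arbitrary example, and it assumes the levels of `site` are the integers 1...17 as in the example data.

```genstat
"Restrict the response variate only, not the factors"
RESTRICT yield; CONDITION=site.LE.10
MODEL    yield
RJOINT   ENVIRONMENT=site; VARIETY=variety
"Remove the restriction afterwards"
RESTRICT yield
```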

References

Digby, P.G.N. (1979). Modified joint regression analysis for incomplete variety × environment data. Journal of Agricultural Science, Cambridge, 93, 81-86.

Finlay, K.W. & Wilkinson, G.N. (1963). The analysis of adaptation in a plant-breeding programme. Australian Journal of Agricultural Research, 14, 742-754.

Example

CAPTION  'RJOINT example'; STYLE=meta
VARIATE  [NVALUES=170] yield
READ     yield
2.70 2.32 2.35 1.86 4.76 5.13 2.37 3.18 3.60 3.99
     2.51 4.71 2.46 2.98 4.06 2.55 4.10
2.77 2.56 2.65 2.03 4.77 4.24 2.31 3.27 3.33 3.86
     3.25 4.10 2.97 2.91 4.25 2.35 3.95
3.13 3.72 3.47 2.66 6.08 5.74 2.45 4.16 *    4.95
     *    *    *    *    *    *    *
3.34 3.38 2.52 2.48 5.54 5.46 2.47 3.74 *    4.48
     *    *    *    *    *    *    *
3.40 3.10 2.73 2.55 5.72 5.71 2.64 3.69 4.00 4.66
     2.77 5.56 2.21 2.61 4.15 2.15 4.25
2.80 2.31 1.99 1.79 4.39 4.69 2.05 3.13 2.53 *
     2.78 4.79 3.12 2.86 3.97 2.70 4.40
2.73 2.66 2.02 2.24 5.07 5.12 2.05 3.30 3.30 *
     2.80 5.15 2.28 2.49 4.34 1.81 3.54
2.77 2.48 2.53 *    *    4.93 2.37 *    3.00 *
     2.72 *    *    *    *    *    *
2.78 3.23 2.70 2.61 6.24 5.77 2.56 3.82 4.03 4.91
     2.94 5.41 2.88 2.57 *    2.44 4.27
3.00 2.76 1.59 2.07 5.04 4.56 2.27 3.39 3.25 3.79
     *    *    *    *    *    *    * :
FACTOR   [NVALUES=170; LEVELS=10] variety
&        [LEVELS=17] site
GENERATE variety,site
MODEL    yield
RJOINT   ENVIRONMENT=site; VARIETY=variety

Updated on June 18, 2019
