# MAREGRESSION procedure

Does regressions for single-channel microarray data (P. Brain, R.W. Payne & D.B. Baird).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | Controls printed output (`model`, `summary`); default `*` i.e. none |
| `TERMS` | Defines the regression model over the slides |
| `WEIGHTS` | Weights for the regression; default 1 |
| `OFFSET` | Offset; default `*` i.e. none |
| `CONSTANT` | How to treat the constant (`estimate`, `omit`); default `esti` |
| `FACTORIAL` | Limit for expansion of model terms; default 3 |
| `FULL` | Whether to assign all possible parameters to factors and interactions (`yes`, `no`); default `no` |
| `POOL` | Whether to pool the information on each term in the analysis of variance (`yes`, `no`); default `no` |
| `RMETHOD` | Type of residuals to form (`deviance`, `Pearson`, `simple`); default `devi` |
| `SPREADSHEET` | What results to save in a book of spreadsheets (`aov`, `residuals`, `fittedvalues`, `estimates`, `se`, `testimates`, `prestimates`); default `*` i.e. none |

### Parameters

| Parameter | Description |
|---|---|
| `Y` = variates or pointers | Y-values for each set of analyses |
| `PROBES` | Defines the probe information for each analysis |
| `SLIDES` | Defines the slide information for each analysis |
| `CHECK` | Slide IDs that can be compared with the labels or levels of the `SLIDES` factor to ensure that the slide order is correct in each analysis |
| `IDS` | Saves the probe names that have been generated to label the rows of the output structures from each analysis |
| `RESIDUALS` | Saves residuals from each set of analyses |
| `FITTEDVALUES` | Saves fitted values from each set of analyses |
| `ESTIMATES` | Saves estimates from each set of analyses |
| `SE` | Saves s.e.’s of estimates |
| `TESTIMATES` | Saves t-statistics of estimates |
| `PRESTIMATES` | Saves t-probabilities of estimates |
| `DF` | Saves degrees of freedom for the model terms or variates in each analysis of variance |
| `SS` | Saves sums of squares for the model terms in each analysis of variance |
| `MS` | Saves mean squares for the model terms in each analysis of variance |
| `RDF` | Saves degrees of freedom from the “residual” lines in each analysis of variance |
| `RSS` | Saves sums of squares from the “residual” lines |
| `RMS` | Saves mean squares from the “residual” lines |
| `TDF` | Saves degrees of freedom from the “total” lines in each analysis of variance |
| `TSS` | Saves sums of squares from the “total” lines |
| `TMS` | Saves mean squares from the “total” lines |
| `VR` | Saves variance ratios for the model terms in each analysis of variance |
| `PRVR` | Saves probabilities of the variance ratios |

### Description

Procedure `MAREGRESSION` does regression analyses for microarray experiments with single-channel data. The experiment is assumed to consist of several slides, each of which represents a unit of the design. The model for the regressions is specified by the `TERMS`, `WEIGHTS`, `OFFSET`, `CONSTANT`, `FACTORIAL` and `FULL` options, which operate exactly as in ordinary regression (see the `MODEL`, `TERMS` and `FIT` directives). The lengths of the factors and variates in the model should be the same as the number of slides (and `MAREGRESSION` will give a failure diagnostic if this is not so).

Each slide contains data on a (large) number of probes or genes. `MAREGRESSION` does a between-slide analysis of the data on each probe. It therefore uses the mean value for any probe observations that are replicated within a slide, and prints a warning if the replication of any probe differs from slide to slide.

The data from the slides are specified by the `Y`, `PROBES` and `SLIDES` parameters, and can be in either a stacked or an unstacked representation. With stacked data, the observations from all the slides are supplied by the `Y` parameter in a single variate, the `SLIDES` factor indicates the slide on which each observation was made, and the `PROBES` factor specifies the probe. With unstacked data, the `Y` parameter supplies a pointer with a variate for each slide, the `PROBES` factor or text specifies the probes (which must be in the same order on every slide), and the `SLIDES` parameter can be omitted or can supply a text defining a label for each slide. The `CHECK` parameter can supply a text or variate to be compared with the labels or levels of the `SLIDES` factor, to verify that the slides have been specified in the correct order.
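The handling of stacked data and within-slide replicates can be pictured with a short sketch. It is written in Python rather than Genstat, and it is not the procedure's own code: the function name `unstack_means` and the sample values are invented for illustration. It averages replicated probe observations within each slide and lists the probes whose replication differs between slides, which is the situation that triggers the warning.

```python
from collections import defaultdict


def unstack_means(y, slides, probes):
    """Turn stacked observations into a probe-by-slide table of means.

    Hypothetical helper: returns (table, unequal), where table[probe][slide]
    is the mean of that probe's observations on that slide, and unequal lists
    the probes whose replication differs from slide to slide.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, slide, probe in zip(y, slides, probes):
        sums[(probe, slide)] += value
        counts[(probe, slide)] += 1

    slide_ids = sorted(set(slides))
    table, unequal = {}, []
    for probe in sorted(set(probes)):
        # Number of observations of this probe on each slide.
        reps = [counts.get((probe, s), 0) for s in slide_ids]
        if len(set(reps)) > 1:
            unequal.append(probe)
        table[probe] = {
            s: sums[(probe, s)] / n for s, n in zip(slide_ids, reps) if n > 0
        }
    return table, unequal


# Made-up stacked data: probe g1 appears twice on slide s1 and once on s2.
y = [1.0, 3.0, 5.0, 2.0, 4.0]
slides = ["s1", "s1", "s2", "s1", "s2"]
probes = ["g1", "g1", "g1", "g2", "g2"]
table, unequal = unstack_means(y, slides, probes)
```

Here `table` plays the role of the unstacked representation (one value per probe and slide), and `unequal` contains only `g1`.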

The `RESIDUALS` and `FITTEDVALUES` parameters allow you to save the residuals and fitted values from the regressions. These are defined as matrices, with a row for each probe, and a column for each slide. The `RMETHOD` option indicates what sort of residual to form, as in the other Genstat regression commands. By default, standardized residuals are formed, but you can set `RMETHOD=simple` to form simple residuals instead.

The `ESTIMATES`, `SE`, `TESTIMATES` and `PRESTIMATES` parameters save the estimates, standard errors, t-statistics and t-probabilities for the parameters in the regression model. These are defined as matrices, with a row for each probe, and a column for each parameter.
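To make the shapes of these matrices concrete, here is a minimal Python sketch (not Genstat, and not the procedure's implementation). It assumes the simplest possible model, a constant plus one explanatory variate `x` measured on each slide, and fits it separately to every probe. The helper `fit_probes` and the data are hypothetical. The t-probabilities are left out because they need the t distribution, which the Python standard library does not provide.

```python
import math


def fit_probes(x, y_rows):
    """Fit y = constant + slope * x to each probe (one row of y_rows).

    Hypothetical helper for illustration. Every returned matrix has one row
    per probe; fitted and residuals have a column per slide, while estimates,
    se and tstats have a column per parameter (constant, slope).
    Assumes more than two slides and a non-zero residual mean square.
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    out = {"fitted": [], "residuals": [], "estimates": [], "se": [], "tstats": []}
    for y in y_rows:
        ybar = sum(y) / n
        slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
        const = ybar - slope * xbar
        fitted = [const + slope * xi for xi in x]
        resid = [yi - fi for yi, fi in zip(y, fitted)]  # simple residuals
        s2 = sum(r * r for r in resid) / (n - 2)  # residual mean square
        se = [math.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx)), math.sqrt(s2 / sxx)]
        out["fitted"].append(fitted)
        out["residuals"].append(resid)
        out["estimates"].append([const, slope])
        out["se"].append(se)
        out["tstats"].append([const / se[0], slope / se[1]])
    return out


# Made-up data: 4 slides (columns) and 2 probes (rows).
x = [0.0, 1.0, 2.0, 3.0]
y_rows = [[1.0, 3.1, 4.9, 7.2], [2.0, 1.5, 1.1, 0.4]]
result = fit_probes(x, y_rows)
```

Because the explanatory variate is shared by all probes, quantities that depend only on `x` (here `xbar` and `sxx`) are computed once, outside the loop over probes.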

The `DF`, `SS`, `MS`, `RDF`, `RSS`, `RMS`, `TDF`, `TSS`, `TMS`, `VR` and `PRVR` parameters store information from the analysis of variance table. (`DF`, `SS`, `MS`, `VR` and `PRVR` are from the “regression” line, `RDF`, `RSS` and `RMS` are from the “residual” line, and `TDF`, `TSS` and `TMS` are from the “total” line.) With the default setting `no` of the `POOL` option each of these is a pointer containing a variate for each term in the `TERMS` formula. The variates each have a unit for every probe. Alternatively, if you set `POOL=yes`, the parameters each have a single variate, with the values pooled over the terms.
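The relationship between these quantities can be sketched for a single probe. The Python below is an illustration only, not the procedure's code: it assumes a model with one factor term (one group label per slide), and it assumes that pooling simply adds the degrees of freedom and sums of squares of the individual terms. The names `one_way_anova` and `pool_terms` and all data values are hypothetical. Variance-ratio probabilities are omitted because they need the F distribution.

```python
def one_way_anova(y, groups):
    """Analysis-of-variance quantities for one probe and one factor term."""
    n = len(y)
    grand = sum(y) / n
    by_group = {}
    for value, g in zip(y, groups):
        by_group.setdefault(g, []).append(value)

    # Sum of squares explained by the factor, and the total sum of squares.
    ss = sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in by_group.values())
    tss = sum((value - grand) ** 2 for value in y)

    df = len(by_group) - 1
    tdf = n - 1
    rdf = tdf - df
    rss = tss - ss
    ms, rms = ss / df, rss / rdf
    return {"DF": df, "SS": ss, "MS": ms,
            "RDF": rdf, "RSS": rss, "RMS": rms,
            "TDF": tdf, "TSS": tss, "TMS": tss / tdf,
            "VR": ms / rms}


def pool_terms(dfs, sss, rms):
    """Pool several model terms into one line (assumed: d.f. and SS add)."""
    df, ss = sum(dfs), sum(sss)
    ms = ss / df
    return {"DF": df, "SS": ss, "MS": ms, "VR": ms / rms}


# Made-up data for one probe: 9 slides in 3 groups of 3.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
aov = one_way_anova(y, groups)

# Two hypothetical terms pooled against a residual mean square of 1.0.
pooled = pool_terms([2, 1], [54.0, 3.0], 1.0)
```

Repeating `one_way_anova` for every probe and collecting each quantity gives one value per probe, which matches the description of variates with a unit for every probe.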

Printed output is controlled by the `PRINT` option, with settings:

    `model` for a description of the regression model, and
    `summary` for a summary of the significance levels found over the probes for each parameter in the model.

The `SPREADSHEET` option allows you to save the various output components in spreadsheets.

Options: `PRINT`, `TERMS`, `WEIGHTS`, `OFFSET`, `CONSTANT`, `FACTORIAL`, `FULL`, `POOL`, `RMETHOD`, `SPREADSHEET`.

Parameters: `Y`, `PROBES`, `SLIDES`, `CHECK`, `IDS`, `RESIDUALS`, `FITTEDVALUES`, `ESTIMATES`, `SE`, `TESTIMATES`, `PRESTIMATES`, `DF`, `SS`, `MS`, `RDF`, `RSS`, `RMS`, `TDF`, `TSS`, `TMS`, `VR`, `PRVR`.

### Method

The analyses are performed by the `FIT` directive and by matrix calculations.

### Action with `RESTRICT`

If any of the y-variates is restricted, the analysis will involve only the units not excluded by the restriction.

### See also

Procedures: `AFFYMETRIX`, `FDRBONFERRONI`, `FDRMIXTURE`, `MAANOVA`, `MABGCORRECT`, `MAEBAYES`, `MARMA`, `MAROBUSTMEANS`, `MAVDIFFERENCE`, `MAVOLCANO`, `QNORMALIZE`, `RYPARALLEL`.

Commands for: Microarray data.

### Example

```
CAPTION   'MAREGRESSION example','Analysis of 9 Arabidopsis slides';\
STYLE=meta,plain
ENQUIRE   CHANNEL=4(-1); EXIST=check[1...2]; NAME=\
'%GENDIR%/Data/Microarrays/Hyb-Expressions.gsh',\
'%GENDIR%/Data/Microarrays/HybFiles.GSH'
IF VSUM(check).EQ.2
" Regression of one-channel microarray data "
MAREGRESS [PRINT=model,summary; FACTORIAL=3; TERMS=Target;\