QNORMALIZE procedure

Performs quantile normalization (D.B. Baird).

Options

PRINT = string token What to print (summary); default summ
PLOT = string tokens What to plot (cdf, histogram, ncdf, nhistogram); default hist, nhis
METHOD = string token Whether to use means, medians or geometric means for the averaged normalized distribution (means, medians, geometricmeans); default mean
ARRANGEMENT = string token Whether to use trellis or single plots for PLOT=cdf or ncdf (single, trellis); default trel
DEVICE = scalar Device number on which to plot the graphs
GRAPHICSFILE = text What graphics filename template to use to save the graphs; default *

Parameters

DATA = variates or pointers Data values
GROUPS = factors or texts Groupings of the data values, or descriptions of the variates in the pointer
NEWDATA = variates or pointers Saves the normalized values; if this is unset, they replace the original values in DATA

Description

QNORMALIZE performs quantile normalization. This transforms the data so that every group has the same cumulative distribution function. The data values are specified by the DATA parameter. They can be in a single variate, with the groupings specified by the GROUPS parameter. Alternatively, they can be in a pointer to separate variates, one for each group; the GROUPS parameter can then be set to a text to describe the variates. The normalized values can be saved using the NEWDATA parameter. If this is not set, they replace the values in the DATA variate(s).
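
As a rough illustration of the idea (not Genstat's implementation), the following Python sketch normalizes data held in a single list with a parallel list of group labels, mirroring the DATA and GROUPS layout. The function name quantile_normalize, the requirement that all groups have the same size, and the way ties are handled are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean


def quantile_normalize(data, groups):
    """Return the normalized values in the original order.

    Assumes every group has the same number of values and none are missing;
    ties are broken by original position (a simplification for this sketch).
    """
    # Positions of each group's values within the data list.
    positions = defaultdict(list)
    for i, g in enumerate(groups):
        positions[g].append(i)

    sizes = {len(idx) for idx in positions.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch requires equal-sized groups")
    n = sizes.pop()

    # Within each group, order the positions by data value.
    ordered = {g: sorted(idx, key=lambda i: data[i])
               for g, idx in positions.items()}

    # Target distribution: mean of the k-th smallest values across groups.
    target = [mean(data[ordered[g][k]] for g in ordered) for k in range(n)]

    # Replace each value by the target value for its within-group rank.
    result = [None] * len(data)
    for idx in ordered.values():
        for k, i in enumerate(idx):
            result[i] = target[k]
    return result


data = [5.0, 2.0, 3.0, 4.0, 1.0, 6.0]
groups = ["a", "a", "a", "b", "b", "b"]
normalized = quantile_normalize(data, groups)
```

After the call, both groups contain the same set of values (1.5, 3.5 and 5.5), each placed at the position of the corresponding rank in the original data.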

The METHOD option selects how the overall distribution is formed from the cumulative distribution functions of the individual groups, with the following settings:

    means takes the means;
    medians takes the medians; and
    geometricmeans takes geometric means (i.e. the mean on the log scale, back-transformed to the natural scale).
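
The three settings can be compared with a small Python sketch. It assumes that the averaging is applied to the values of equal rank across equal-sized groups; the AVERAGERS mapping and the function target_distribution are hypothetical names used only for illustration, not part of Genstat.

```python
from statistics import geometric_mean, mean, median

# Hypothetical mapping from the METHOD settings to averaging functions.
AVERAGERS = {
    "means": mean,
    "medians": median,
    "geometricmeans": geometric_mean,  # needs strictly positive values
}


def target_distribution(sorted_groups, method="means"):
    """Average the k-th smallest values across equal-sized, pre-sorted groups."""
    average = AVERAGERS[method]
    return [average(values) for values in zip(*sorted_groups)]


# Three groups of three values, each already sorted.
sorted_groups = [[1.0, 2.0, 8.0], [1.0, 4.0, 8.0], [1.0, 8.0, 8.0]]
by_mean = target_distribution(sorted_groups, "means")
by_median = target_distribution(sorted_groups, "medians")
by_geomean = target_distribution(sorted_groups, "geometricmeans")
```

For the middle rank (values 2, 4 and 8) the mean is about 4.67, while the median and the geometric mean are both 4, which shows how the medians and geometricmeans settings are less affected by one large value.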

The PLOT option controls which plots are produced: histograms (histogram) or cumulative distribution plots (cdf) of the original data, and the corresponding plots (nhistogram, ncdf) of the normalized data. By default the cumulative distribution plots for the groups are displayed in a trellis arrangement, but you can set option ARRANGEMENT=single to display them separately, in single plots. You can use the DEVICE option to plot to a device other than the screen. The GRAPHICSFILE option supplies a template for the names of the files in which the graphs are saved.

By default a summary is printed, giving quantiles for each group. This can be suppressed by setting option PRINT=*.
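
A summary of this kind can be approximated in Python as below. This is only a sketch: the choice of quartiles, the "inclusive" interpolation rule and the function name quartiles_by_group are assumptions, and the quantiles printed by QNORMALIZE may be defined differently.

```python
from collections import defaultdict
from statistics import quantiles


def quartiles_by_group(data, groups):
    """Return the three quartile cut points for each group."""
    by_group = defaultdict(list)
    for value, g in zip(data, groups):
        by_group[g].append(value)
    return {g: quantiles(values, n=4, method="inclusive")
            for g, values in by_group.items()}


data = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0]
groups = ["a"] * 5 + ["b"] * 5
summary = quartiles_by_group(data, groups)
```

Comparing such a table before and after normalization gives a quick check that the groups have been brought onto a common distribution.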

Options: PRINT, PLOT, METHOD, ARRANGEMENT, DEVICE, GRAPHICSFILE.

Parameters: DATA, GROUPS, NEWDATA.

Action with RESTRICT

Any restrictions on the DATA variates are removed.

See also

Procedure: MABGCORRECT.

Commands for: Calculations and manipulation, Microarray data.

Example

CAPTION      'QNORMALIZE example',\
             'Expression values from 9 Arabidopsis Slides'; STYLE=meta,plain
ENQUIRE      CHANNEL=-1; EXIST=check; NAME=\
             '%GENDIR%/Data/Microarrays/Hyb-PM_MM.gwb'
IF check
  SPLOAD     '%GENDIR%/Data/Microarrays/Hyb-PM_MM.gwb'
  CALCULATE  log2PM = log(PM)/log(2)
  DELETE     [REDEFINE=yes] Atoms,PM,MM
  " Do normalization on just three slides for speed "
  SUBSET     [Slides.in.!T('hyb1191','hyb1192','hyb1193');SETLEVELS=yes]\
             Slides,Probes,log2PM
  " Quantile Normalization for 1 Channel Microarray Data."
  QNORMALIZE [PRINT=summary; PLOT=histogram,cdf,nhistogram; METHOD=means;\
             ARRANGEMENT=single] DATA=log2PM; GROUPS=Slides; NEWDATA=nPM
ELSE
  CAPTION    'Microarray example datasets have not been installed.'
ENDIF

Updated on March 6, 2019
