Clusters microarray slides (D.B. Baird).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | What to print (`cluster`, `pco`, `correlations`, `distances`); default `clus`, `pco`, `corr`, `dist` |
| `PLOT` = string tokens | What to plot (`dendrogram`, `mst`); default `dend`, `mst` |
| `DMETHOD` = string token | What distance method to use to form the similarity matrix (`correlation`, `euclidean`, `cityblock`); default `corr` |
| `PERCENT` = scalar | Percentage of the probes/genes to use to calculate correlations; default 100 |
| `DTITLE` = text | Title for the dendrogram |
| `MTITLE` = text | Title for the minimum spanning tree |
| `WINDOW` = scalar | Window number for the graphs; default 3 |
| `DEVICE` = scalar | Device number on which to plot the graphs |
| `GRAPHICSFILE` = text | What graphics filename template to use to save the graphs; default `*` |

### Parameters

| Parameter | Description |
|---|---|
| `DATA` = variates or pointers | Data values (i.e. log-ratios) |
| `SLIDES` = factors, texts or variates | Identifies the slides |
| `PROBES` = factors, texts or variates | Identifies the probes or genes |
| `CORRELATION` = symmetric matrices | Saves the correlation matrix |
| `DISTANCE` = symmetric matrices | Saves the distance matrix |

### Description

`MASCLUSTER` clusters microarray slides (or targets) together on the similarity of their responses over a number of probes or genes. The slides are grouped so that their patterns of response over the probes/genes are similar within a group, with the groups as distinct as possible.

The `DMETHOD` option specifies the distance method to use to form the similarity matrix: `correlation` (the default), `euclidean` or `cityblock`.

With large numbers of probes or genes, many may be non-informative, being subject only to random variation. The `PERCENT` option therefore controls the percentage of the probes to use: if `PERCENT` is less than the default of 100, `MASCLUSTER` uses only the top `PERCENT` percent of the probes, ranked by their mean absolute response.
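As an illustration of the `DMETHOD` and `PERCENT` options, the following sketch clusters the slides on Euclidean distances using only the top quarter of the probes. It assumes that the data set used in the example at the end of this page has been loaded, so that `cLogRatio`, `Slide` and `NAME` exist; `slidedist` is a hypothetical identifier chosen here to receive the distance matrix.

```genstat
" Sketch only: cLogRatio, Slide and NAME come from the example data set;
  slidedist is a hypothetical name for the saved distance matrix. "
MASCLUSTER [PRINT=distances; PLOT=dendrogram; DMETHOD=euclidean;\
  PERCENT=25] DATA=cLogRatio; SLIDES=Slide; PROBES=NAME;\
  DISTANCE=slidedist
```

A smaller `PERCENT` restricts the calculation to the probes with the largest mean absolute response, which may make the clustering less sensitive to probes that show only random variation.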

The log-ratios are supplied by the `DATA` parameter. If these are in a single variate, the `SLIDES` parameter should supply a factor to index the slides, and the `PROBES` parameter should index the probes or genes. Alternatively, you can supply a pointer containing a variate for each slide. The `SLIDES` factor is then not required; if it is given, it should have just one entry for each slide, in the order of the variates in the pointer. The `PROBES` factor is then that for a single slide, and all slides must have a common layout.
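The pointer form of `DATA` might be set up as in the sketch below. All the identifiers are hypothetical: `s1`, `s2` and `s3` are assumed to be variates holding the log-ratios for three slides with a common layout, `probe` is assumed to identify the probes on a single slide, and `slidename` is an optional text with one label per slide, in the same order as the variates in the pointer.

```genstat
" Sketch only: s1, s2, s3, probe, slidedata and slidename are
  hypothetical identifiers, not part of any supplied data set. "
POINTER [VALUES=s1,s2,s3] slidedata
TEXT [VALUES='Slide 1','Slide 2','Slide 3'] slidename
MASCLUSTER DATA=slidedata; SLIDES=slidename; PROBES=probe
```

Because `SLIDES` is not required with a pointer, the `TEXT` statement and the `SLIDES=slidename` setting could be left out.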

The `DTITLE` and `MTITLE` options can supply titles for the plots of the dendrogram and minimum spanning tree, respectively, and the `WINDOW` option specifies the window to use (by default 3). You can use the `DEVICE` option to plot to a device other than the screen. The `GRAPHICSFILE` option then supplies a template for the file names.
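For instance, the titles and window might be set as in the following sketch, which again assumes that the example data set (`cLogRatio`, `Slide`, `NAME`) has been loaded. The title strings and the window number are arbitrary choices for illustration. `DEVICE` and `GRAPHICSFILE` are left at their defaults, because suitable settings depend on the devices available in your installation.

```genstat
" Sketch only: titles and window number are illustrative choices. "
MASCLUSTER [PLOT=dendrogram,mst; DTITLE='Dendrogram of slides';\
  MTITLE='Minimum spanning tree of slides'; WINDOW=1]\
  DATA=cLogRatio; SLIDES=Slide; PROBES=NAME
```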

Options: `PRINT`, `PLOT`, `DMETHOD`, `PERCENT`, `DTITLE`, `MTITLE`, `WINDOW`, `DEVICE`, `GRAPHICSFILE`.

Parameters: `DATA`, `SLIDES`, `PROBES`, `CORRELATION`, `DISTANCE`.

### Action with `RESTRICT`

Any restrictions on the `DATA` variates are removed.

### See also

Procedures: `DMADENSITY`, `FDRBONFERRONI`, `FDRMIXTURE`, `MACALCULATE`, `MAESTIMATE`, `MAHISTOGRAM`, `MAPCLUSTER`, `MAPLOT`, `MASHADE`, `MAVOLCANO`, `MA2CLUSTER`, `MNORMALIZE`.

Commands for: Microarray data.

### Example

    CAPTION 'MASCLUSTER example'; STYLE=meta
    ENQUIRE CHANNEL=-1; EXIST=check; NAME=\
      '%GENDIR%/Data/Microarrays/ApoAIKnockOutStacked.GSH'
    IF check
      SPLOAD '%GENDIR%/Data/Microarrays/ApoAIKnockOutStacked.GSH'
      " Cluster Slides from APO Mouse Knock-out Data."
      MASCLUSTER [PRINT=correlations,cluster; PLOT=dendrogram;\
        DMETHOD=correlation; PERCENT=10] DATA=cLogRatio;\
        SLIDES=Slide; PROBES=NAME
    ELSE
      CAPTION 'Microarray example datasets have not been installed.'
    ENDIF