Uses principal components analysis and the Tracy-Widom statistic to find the number of significant principal components to represent a set of variables (M. Malosetti & J.T.N.M. Thissen).

### Options

| `PRINT` = string tokens | What to print (`summary`, `scores`); default `summ` |
|---|---|
| `NROOTS` = scalar | Number of principal components to retain; default saves the significant components |
| `PLOT` = string tokens | What to plot (`eigenvalues`, `%variance`); default `eige`, `%var` |
| `PROBABILITY` = scalar | Specifies the significance level; default 0.05 |
| `SCALING` = string token | Whether to scale the principal component scores by the square roots of their singular values (`singularvalues`, `none`); default `none` |
| `STANDARDIZE` = string token | How to standardize the `DATA` variates (`frequency`, `none`); default `freq` |
| `TITLE` = text | General title for the plots |

### Parameters

| `DATA` = pointers | Data variates; must be set |
|---|---|
| `SCORES` = pointers | Pointer of variates to store the scores of the significant axes for each set of `DATA` variates |
| `EVALUES` = variates | Saves the eigenvalues of the significant principal components |
| `NEFFECTIVE` = scalars | Saves the effective number of columns of the marker data matrix |
| `%VARIANCE` = variates | Saves the percentage variances explained by the significant principal components |
| `CUM%VARIANCE` = variates | Saves the cumulative percentage variances explained by the significant principal components |

### Description

`QEIGENANALYSIS` performs a principal component analysis on a set of variables, supplied by the `DATA` parameter, and determines the number of significant components according to the significance level specified by the `PROBABILITY` option (default 0.05). You can set the number of principal component axes to retain by using the `NROOTS` option; if this is unset, the significant components are saved. By default the variates are standardized before doing the analysis, but you can set option `STANDARDIZE=none` to suppress this. The scores of the significant principal components can be saved, in a pointer of variates, using the `SCORES` parameter. You can set option `SCALING=singularvalues` to scale the scores by the square roots of their singular values; by default they are not scaled.
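As an illustration of the default standardization step, the following Python sketch may help. It is not the procedure's own code. It assumes that `STANDARDIZE=frequency` corresponds to the allele-frequency normalization of Patterson *et al.* (2006), which this page does not state explicitly, and that marker scores are allele counts 0, 1 or 2 with `None` marking a missing value. The function name `standardize_markers` is hypothetical. Missing scores are replaced by the marker mean, as described in the Method section.

```python
import math


def standardize_markers(counts):
    """Centre and scale marker columns by allele frequency (a sketch).

    counts: list of rows (one per genotype) holding allele counts 0/1/2,
    with None for a missing score. Assumption: this mirrors the
    normalization of Patterson et al. (2006), not necessarily the exact
    computation inside QEIGENANALYSIS.
    """
    n_rows = len(counts)
    n_cols = len(counts[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        observed = [row[j] for row in counts if row[j] is not None]
        mean = sum(observed) / len(observed)
        p = mean / 2.0                      # estimated allele frequency
        scale = math.sqrt(p * (1.0 - p))
        for i, row in enumerate(counts):
            # A missing score is replaced by the marker mean,
            # so it becomes 0 after centring.
            value = mean if row[j] is None else row[j]
            # A monomorphic marker (scale 0) carries no information.
            out[i][j] = (value - mean) / scale if scale > 0 else 0.0
    return out


example = [[0, 2], [1, None], [2, 0]]
standardized = standardize_markers(example)
```

With `STANDARDIZE=none` this step would be skipped and the `DATA` variates analysed as supplied.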

The `PRINT` option controls printed output, with settings:

| `summary` | to print the Tracy-Widom statistics of the significant principal components, |
|---|---|
| `scores` | to print the scores of the significant principal components. |

The default is `PRINT=summary`.

The `PLOT` option selects the graphs to plot, with settings:

| `eigenvalues` | plots eigenvalues against the number of principal components, and |
|---|---|
| `%variance` | plots the percentage variance explained and cumulative percentage variance explained, against the number of principal components. |

The default is to plot both graphs. The `TITLE` option can supply a title for the graphs.

The `EVALUES` parameter can be used to save the eigenvalues, and the `%VARIANCE` and `CUM%VARIANCE` parameters can save the percentage variances and cumulative percentage variances explained by the significant principal components. The `NEFFECTIVE` parameter can save the effective number of columns of the marker data matrix, estimated as described by Patterson *et al.* (2006).
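The relationship between the saved eigenvalues and the two percentage-variance quantities can be sketched in Python as below. This is a minimal illustration with a hypothetical function name, and it assumes that the percentages are taken relative to the sum of all eigenvalues supplied; the page does not say which total the procedure uses.

```python
def percent_variance(eigenvalues):
    """Return (percentages, cumulative percentages) for a list of eigenvalues.

    Assumption: each percentage is relative to the sum of the eigenvalues
    passed in.
    """
    total = sum(eigenvalues)
    percentages = [100.0 * ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for value in percentages:
        running += value
        cumulative.append(running)
    return percentages, cumulative


pct, cum = percent_variance([4.0, 3.0, 2.0, 1.0])
print(pct)  # [40.0, 30.0, 20.0, 10.0]
print(cum)  # [40.0, 70.0, 90.0, 100.0]
```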

Options: `PRINT`, `NROOTS`, `PLOT`, `PROBABILITY`, `SCALING`, `STANDARDIZE`, `TITLE`.

Parameters: `DATA`, `SCORES`, `EVALUES`, `NEFFECTIVE`, `%VARIANCE`, `CUM%VARIANCE`.

### Method

`QEIGENANALYSIS` implements the method described by Patterson *et al.* (2006). It uses the `SVD` directive to perform the principal components analysis, and iteratively calculates the Tracy-Widom statistic for the principal components until one is found to be non-significant. Missing values in the marker score data of each marker are replaced by the means of the marker scores of that marker. The significance of the principal components is assessed using tabulated values of the Tracy-Widom density function.
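The iterative test can be sketched in Python as follows. This is an illustration only, written from the formulas for the test statistic and the effective number of markers in Patterson *et al.* (2006), and should be checked against that paper; it is not the procedure's source. The function names are hypothetical, the eigenvalues are assumed to have been computed already, and a single approximate 5% critical value (0.9793) stands in for the tabulated Tracy-Widom values that the procedure uses.

```python
import math

# Assumption: approximate upper 5% point of the Tracy-Widom distribution,
# used here in place of the procedure's tabulated values.
TW_CRITICAL_5PCT = 0.9793


def tracy_widom_statistic(eigenvalues):
    """Return (statistic, effective number of columns) for the largest
    of the given eigenvalues, or None if the estimate is undefined."""
    m = len(eigenvalues)
    s1 = sum(eigenvalues)
    s2 = sum(ev * ev for ev in eigenvalues)
    denominator = (m - 1) * s2 - s1 * s1
    if denominator <= 0:
        return None
    n_eff = (m + 1) * s1 * s1 / denominator   # effective number of columns
    if n_eff <= 1:
        return None
    top = m * max(eigenvalues) / s1           # scaled largest eigenvalue
    root = math.sqrt(n_eff - 1) + math.sqrt(m)
    mu = root * root / n_eff
    sigma = (root / n_eff) * (
        1 / math.sqrt(n_eff - 1) + 1 / math.sqrt(m)
    ) ** (1 / 3)
    return (top - mu) / sigma, n_eff


def count_significant(eigenvalues, critical=TW_CRITICAL_5PCT):
    """Test eigenvalues in decreasing order; stop at the first
    non-significant one and return how many were significant."""
    evs = sorted(eigenvalues, reverse=True)
    k = 0
    while len(evs) - k >= 3:
        result = tracy_widom_statistic(evs[k:])
        if result is None or result[0] <= critical:
            break
        k += 1   # drop the significant eigenvalue and test the next one
    return k


# Artificial spectrum: one large eigenvalue above a bulk of 49 values.
spiked = [3.0] + [1.2] * 24 + [1.0] + [0.8] * 24
print(count_significant(spiked))  # 1
```

Each pass removes the eigenvalue just found significant and recomputes both the statistic and the effective number of columns from the remaining eigenvalues, which is one way to read "iteratively" in the description above.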

### Action with `RESTRICT`

Restrictions are not allowed.

### Reference

Patterson, N., Price, A.L. & Reich, D. (2006). Population structure and eigenanalysis. *PLoS Genetics*, 2, e190. doi:10.1371/journal.pgen.0020190

### See also

Procedures: `QLDDECAY`, `QMASSOCIATION`, `QSASSOCIATION`.

Commands for: Statistical genetics and QTL estimation.

### Example

    CAPTION 'QEIGENANALYSIS example'; STYLE=meta
    QIMPORT [POPULATION=amp] '%GENDIR%/Examples/QAssociation_geno.txt';\
      MAPFILE='%GENDIR%/Examples/QAssociation_map.txt'; MKSCORES=mk;\
      CHROMOSOMES=mkchr; POSITIONS=mkpos; MKNAMES=mknames; IDMGENOTYPES=geno_id
    QEIGENANALYSIS [PRINT=summary; PLOT=eigenvalues,%variance;\
      PROBABILITY=0.05; SCALING=none; STANDARDIZE=frequency] \
      mk; scores=PCscores; %VARIANCE=explained;\
      CUM%VARIANCE=cumulative
    PRINT PCscores,explained,cumulative
    QEIGENANALYSIS [PRINT=summary; PLOT=eigenvalues,%variance; NROOTS=10;\
      PROBABILITY=0.05; SCALING=none; STANDARDIZE=frequency]\
      mk; SCORES=PCscores2; %VARIANCE=explained2;\
      CUM%VARIANCE=cumulative2
    PRINT PCscores2,explained2,cumulative2