Samples from a set of units, possibly stratified by factors (P.W. Lane).

### Options

| `SEED` = scalar | Seed for the random number generator; default 0, i.e. continue from previous generation |
|---|---|
| `NVALUES` = scalar | Number of units from which a simple sample is to be taken; default `*`, i.e. as defined by a `UNITS` statement |

### Parameters

| `NSAMPLE` = scalars or tables | Number of values in a simple sample, or table of numbers of values at each combination of levels of its classifying factors; no default |
|---|---|
| `SAMPLE` = identifiers | Structure to store the result; no default |

### Description

Procedure `SAMPLE` produces a random sample from a set of units. A simple sample can be obtained by setting the `NSAMPLE` parameter to the required number in the sample, and the `NVALUES` option to the number of units in the set. The `NVALUES` option can be omitted if the required number of units has been defined by a `UNITS` statement earlier in the job.

For a stratified sample, the `NSAMPLE` parameter should be set to a table containing the required number of units to be sampled at each combination of levels of the factors classifying the table. The `NVALUES` option is not then relevant, as the set of units is determined by the values of the classifying factors.

The `SAMPLE` parameter must be set to an identifier, which will be formed into a variate containing a set of `NSAMPLE` integers in the range (1…`NVALUES`), obtained by random sampling without replacement. The `SEED` option can be set to define a starting value for the random numbers used to select the units. This can be omitted if some random numbers have already been generated during the current job; `SAMPLE` will then take the numbers that continue the previous sequence.
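
The properties of the result described above (distinct unit numbers in the range 1…`NVALUES`, reproducible for a fixed seed) can be illustrated with a short sketch. It is written in Python rather than Genstat and is not the procedure's own code; the function name `simple_sample` is hypothetical, and Python's generator differs from Genstat's, so the same seed will not select the same units.

```python
import random


def simple_sample(nsample, nvalues, seed):
    """Return nsample distinct unit numbers drawn from 1..nvalues.

    Hypothetical helper for illustration only; it mimics the documented
    behaviour of a simple sample, not Genstat's internal implementation.
    """
    if not 0 <= nsample <= nvalues:
        raise ValueError("nsample must lie between 0 and nvalues")
    rng = random.Random(seed)  # Python's generator, not Genstat's
    # Sampling without replacement: no unit number can appear twice.
    return rng.sample(range(1, nvalues + 1), nsample)


# Same settings as part 1 of the Example: 10 units out of 100.
selected = simple_sample(10, 100, seed=55326)
again = simple_sample(10, 100, seed=55326)  # same seed, same sample
```

Fixing the seed makes the selection repeatable, which is the role the `SEED` option plays when it is set to a non-zero value.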

Options: `SEED`, `NVALUES`.

Parameters: `NSAMPLE`, `SAMPLE`.

### Method

For a simple sample, a full set of units (1…`NVALUES`) is randomly ordered and the first `NSAMPLE` values are taken. For a stratified sample, the units are randomly ordered, then sorted according to the levels of the classifying factors, and the requested number of values is taken for each combination of levels.
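
The stratified method can be sketched as follows. This is a minimal Python illustration of the steps described above, not the Genstat source; the function `stratified_sample` and the variable names are hypothetical, and the data mirror the two factors and the table of counts used in the Example.

```python
import random
from collections import defaultdict


def stratified_sample(strata, nsample, seed):
    """Pick units within each stratum after a random ordering.

    strata[i] is the stratum label of unit i+1 (here a tuple of factor
    levels); nsample maps each label to the number of units required.
    Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)  # Python's generator, not Genstat's
    units = list(range(1, len(strata) + 1))
    rng.shuffle(units)  # step 1: random ordering of all units

    # Step 2: group the shuffled units by stratum, keeping the random order.
    by_stratum = defaultdict(list)
    for unit in units:
        by_stratum[strata[unit - 1]].append(unit)

    # Step 3: take the requested number of units from each stratum.
    chosen = []
    for label, count in nsample.items():
        if count > len(by_stratum[label]):
            raise ValueError(f"stratum {label} has fewer than {count} units")
        chosen.extend(by_stratum[label][:count])
    return sorted(chosen)


# 36 units classified by F1 (3 levels) and F2 (2 levels), as in the Example.
f1 = [1, 2, 3] * 12
f2 = [1, 1, 1, 2, 2, 2] * 6
strata = list(zip(f1, f2))
numbers = {(1, 1): 1, (1, 2): 1, (2, 1): 1, (2, 2): 2, (3, 1): 2, (3, 2): 2}
chosen = stratified_sample(strata, numbers, seed=55326)
```

Because the grouping preserves the shuffled order, taking the first few units in each group is a simple random sample within that stratum.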

### Action with `RESTRICT`

The factors classifying the table must not be restricted. The procedure cannot be used on a restricted set of units.

### See also

Directive: `CALCULATE`.

Procedures: `GREJECTIONSAMPLE`, `GRMULTINORMAL`, `SVSAMPLE`.

Functions: `GRBETA`, `GRBINOMIAL`, `GRCHISQUARE`, `GRF`, `GRGAMMA`, `GRHYPERGEOMETRIC`, `GRLOGNORMAL`, `GRNORMAL`, `GRPOISSON`, `GRSAMPLE`, `GRSELECT`, `GRT`, `GRUNIFORM`.

Commands for: Calculations and manipulation.

### Example

```
CAPTION   'SAMPLE example',\
          '1) select a random sample of 10 out of 100 units;'; STYLE=meta,plain
SAMPLE    [SEED=55326; NVALUES=100] NSAMPLE=10; SAMPLE=Selected
PRINT     Selected; DECIMALS=0
CAPTION   !t('2) select specified numbers of units at each combination',\
          'of levels of two factors.')
FACTOR    [LEVELS=3; VALUES=12(1...3)] F1
&         [LEVELS=2; VALUES=6(1,2)3] F2
TABLE     [CLASSIFICATION=F1,F2; VALUES=(1,2)3] Numbers
SAMPLE    NSAMPLE=Numbers; SAMPLE=Chosen
CAPTION   'Show which units and factor combinations have been selected.'
VARIATE   [VALUES=1...36] Unit
RESTRICT  Unit,F1,F2; EXPAND(Chosen; 36)
PRINT     Unit,F1,F2; DECIMALS=0
CAPTION   'Demonstrate that the correct numbers of units have been chosen.'
TABULATE  [CLASSIFICATION=F1,F2; COUNT=Check]
PRINT     Numbers,Check; DECIMALS=0
```