UNSTACK procedure

Splits vectors into individual vectors according to levels of a factor (R.W. Payne).

Options

`DATASET` = factor	Factor identifying the unstacked data sets
`IDSTACKED` = factors	Factors identifying how the units of the unstacked data sets should be matched
`IDUNSTACKED` = factors	Factors defined to identify these units in the unstacked vectors
`MVINCLUDE` = strings	Which missing values to include (`datasets`, `idstacked`); default `*` i.e. none

Parameters

`STACKEDVECTOR` = variates, factors or texts	Vectors to be unstacked
`DATASETINDEX` = scalars or texts	Level or label of the `DATASET` factor indicating the group whose units are to be stored in the `UNSTACKEDVECTOR`; by default the levels of `DATASET` are taken one at a time (and the list is then recycled to match the other parameters)
`UNSTACKEDVECTOR` = variates, factors or texts	Unstacked vectors

Description

UNSTACK allows you to split up (or unstack) vectors into individual vectors. The contents of the individual vectors are determined by a factor, specified by the DATASET option. In the simplest case, each original (stacked) vector is split into several new (unstacked) vectors, one for each level of DATASET. The process assumes that the sets are “replicate” sets of data. For example, DATASET might correspond to days on which identical sampling schemes were followed. In the most straightforward case, each set contains the same number of observations, all stored in an identical order. However, if the observations are in different orders, or if some are absent from some of the sets, you can use the IDSTACKED option to specify one or more factors to identify the matching observations within the sets. The IDUNSTACKED option then allows you to save new factors to indicate where the observations are stored in the new (unstacked) vectors. The unstacked vectors are all of the same length, and missing values are inserted for absent observations.
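
As a minimal sketch of the matching described above, the following uses a small made-up data set. The identifiers Plot, NewPlot, W1 and W2, and all the data values, are hypothetical and chosen only for illustration. The plots are recorded in a different order on day 2, and plot 3 is absent on that day.

```genstat
" Hypothetical data: 3 plots weighed on day 1, only plots 2 and 1 on day 2 "
FACTOR   [LEVELS=2; VALUES=1,1,1,2,2] Days
FACTOR   [LEVELS=3; VALUES=1,2,3,2,1] Plot
VARIATE  [VALUES=10.1,11.4,9.8,11.9,10.6] Weight
" Match the observations by Plot; NewPlot identifies the units of W1 and W2 "
UNSTACK  [DATASET=Days; IDSTACKED=Plot; IDUNSTACKED=NewPlot]\
         Weight,Weight; DATASETINDEX=1,2; UNSTACKEDVECTOR=W1,W2
PRINT    NewPlot,W1,W2; DECIMALS=0,1,1
```

On this reading of the description, W1 and W2 should each have three units, one per plot, with a missing value in W2 for plot 3.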

The MVINCLUDE option controls the inclusion of missing values in the unstacked vectors, with the following settings:

`idstacked`	includes units with missing values for levels of the `IDSTACKED` factors that do not occur in the data set (otherwise these are omitted), and
`datasets`	defines the unstacked vectors that correspond to data set indexes that do not occur in the data, and fills them with missing values (otherwise these are left undeclared, and a warning is given).

By default none of these are included.
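
A hedged sketch of how these settings might be used, with made-up data and hypothetical identifiers (Plot, NewPlot, W1, W2, W3). Here the Days factor is declared with a third level and Plot with a fourth level, neither of which occurs in the data.

```genstat
" Hypothetical data: level 3 of Days and level 4 of Plot never occur "
FACTOR   [LEVELS=3; VALUES=1,1,1,2,2] Days
FACTOR   [LEVELS=4; VALUES=1,2,3,2,1] Plot
VARIATE  [VALUES=10.1,11.4,9.8,11.9,10.6] Weight
" Request both kinds of missing-value padding "
UNSTACK  [DATASET=Days; IDSTACKED=Plot; IDUNSTACKED=NewPlot;\
         MVINCLUDE=datasets,idstacked]\
         Weight,Weight,Weight; DATASETINDEX=1,2,3;\
         UNSTACKEDVECTOR=W1,W2,W3
PRINT    NewPlot,W1,W2,W3; DECIMALS=0,3(1)
```

With both settings, the description above suggests that W3 is defined but entirely missing, and that every unstacked vector gains a unit for plot 4 containing a missing value. Without MVINCLUDE, W3 would be left undeclared (with a warning) and the plot 4 unit would be omitted, so W3 would have to be dropped from the PRINT statement.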

There are three parameters. STACKEDVECTOR lists the vectors (variates, factors or texts) that are to be split up. DATASETINDEX specifies a level of the DATASET factor for each member of the STACKEDVECTOR list, and UNSTACKEDVECTOR specifies a new vector to store the units of the STACKEDVECTOR corresponding to that DATASETINDEX. So, for example

UNSTACK [DATASET=Days] 5(Weight,Height);\
        DATASETINDEX=1,2,3,4,5;\
        UNSTACKEDVECTOR=W1,W2,W3,W4,W5,H1,H2,H3,H4,H5

would put the weight measurements made on days 1-5 into W1, W2, W3, W4 and W5, respectively, and the height measurements into H1, H2, H3, H4 and H5. (The construct 5(Weight,Height) is equivalent to typing Weight five times and then Height five times, and the DATASETINDEX list 1,2,3,4,5 is repeated twice so that it matches the lengths of the other parameter lists.) This method of specification means that you are free to list the vectors and levels in whatever order is most convenient. For example

UNSTACK [DATASET=Days] (Weight,Height)5;\
        DATASETINDEX=2(1,2,3,4,5);\
        UNSTACKEDVECTOR=W1,H1,W2,H2,W3,H3,W4,H4,W5,H5

lists them in group order rather than one stacked vector at a time. If DATASETINDEX is not specified, the levels of DATASET are taken in order one at a time (and recycled if necessary).
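
To make the list expansions concrete, the first statement above could be written out in full as shown below. This is a sketch based only on the expansion rules described in this section. The second statement assumes that Days has exactly the levels 1 to 5, so that the default DATASETINDEX gives the same result.

```genstat
" 5(Weight,Height) written out: each identifier repeated five times in turn, "
" with the DATASETINDEX list 1,2,3,4,5 recycled to cover all ten vectors "
UNSTACK  [DATASET=Days]\
         Weight,Weight,Weight,Weight,Weight,Height,Height,Height,Height,Height;\
         DATASETINDEX=1,2,3,4,5,1,2,3,4,5;\
         UNSTACKEDVECTOR=W1,W2,W3,W4,W5,H1,H2,H3,H4,H5

" Assuming Days has exactly the levels 1...5, DATASETINDEX can be omitted "
UNSTACK  [DATASET=Days] 5(Weight,Height);\
         UNSTACKEDVECTOR=W1,W2,W3,W4,W5,H1,H2,H3,H4,H5
```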

Options: DATASET, IDSTACKED, IDUNSTACKED, MVINCLUDE.

Parameters: STACKEDVECTOR, DATASETINDEX, UNSTACKEDVECTOR.

Method

The vectors are unstacked using the standard Genstat manipulation commands, including SUBSET and EQUATE.

Action with `RESTRICT`

Any restrictions on the vectors are ignored.

Example

CAPTION  'UNSTACK example'; STYLE=meta
FACTOR   [LEVELS=31; VALUES=1...31,1...30] Day
FACTOR   [LABELS=!t(March,April); VALUES=31(1),30(2)] Month
VARIATE  [NVALUES=61] Rainfall,Temperature
READ     Rainfall,Temperature
  2.7 11.7  2.9  7.6  1.7  8.0  4.2  9.4  4.1  3.7
  0.2 11.4  2.5  6.3  3.0 11.9  0.3  4.5  0.6 10.4
  3.0 11.3  0.3  7.3  4.2 12.0  2.5 10.0  3.9  9.0
  1.2  4.6  1.4  5.7  3.7 11.8  2.9 11.9  2.7 10.4
  0.9  3.0  4.7 10.4  3.5  6.0  3.4  9.2  3.3  5.6
  4.8  7.5  1.9 11.7  0.9 10.3  1.1  3.4  0.2  5.7
  1.0  4.1  0.1  6.0  3.2 11.0  0.6  4.4  1.2 11.2
  1.7  9.1  1.4  3.6  3.1  8.2  3.0  9.7  0.9  7.3
  1.8  3.6  3.7 12.5  3.9  7.9  1.4  9.3  4.6 12.2
  0.8  4.6  1.9  5.8  3.1  8.8  3.6 11.5  2.8  8.4
  2.8 11.7  0.3  6.1  4.5  4.5  4.2  6.2  1.6 12.5
  2.7  5.8  2.7  5.5  2.7  9.6  1.8  5.8  0.4  9.1
  0.8  3.6  :
PRINT    Month,Day,Rainfall,Temperature; DECIMALS=0,0,1,1
UNSTACK  [DATASET=Month; IDSTACKED=Day; IDUNSTACK=Nday]\
         2(Rainfall,Temperature); DATASETINDEX='March','April';\
         UNSTACKEDVECTOR=MarchRain,AprilRain,MarchTemp,AprilTemp
PRINT    Nday,MarchRain,AprilRain,MarchTemp,AprilTemp; DECIMALS=0,4(1)

Updated on January 12, 2022
