HREDUCE directive

Forms a reduced similarity matrix (referring to the GROUPS instead of the original units).

Options

`PRINT` = string token	Printed output required (`similarities`); default `*` i.e. no printing
`METHOD` = string token	Method used to form the reduced similarity matrix (`first, last, mean, minimum, maximum, zigzag`); default `firs`

Parameters

`SIMILARITY` = symmetric matrices	Input similarity matrix
`REDUCEDSIMILARITY` = symmetric matrices	Output (reduced) similarity matrix
`GROUPS` = factors	Factor defining the groups
`PERMUTATION` = variates	Permutation order of units (for `METHOD` = `firs`, `last` or `zigz`)

Description

Sometimes you may want to regard an n-by-n similarity matrix S as being partitioned into a b-by-b array of rectangular blocks, one block for each pair of groups. You might then want to form a reduced matrix of similarities between the b groups instead of between the n individual units. To do this, each of the b² blocks of the full matrix must be replaced by a single value. Each diagonal block is replaced by unity. The METHOD option specifies how to replace the off-diagonal blocks, for example by the maximum, minimum or mean similarity within the block. The zigzag method (Rayner 1966) is relevant in particular when the data consist of b soil samples, for each of which information is recorded on several soil horizons, possibly different in the different samples. The method recognizes that certain horizons might be absent from some soil samples; this leads to finding successive optimal matches, subject to the constraint that one horizon cannot match a horizon that has already been assigned to a higher level. After finding these optima, an average is taken for each horizon.
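The block replacement for the mean, minimum and maximum methods can be illustrated outside Genstat. The following is a minimal Python sketch, not Genstat's own implementation: the function name `reduce_similarity`, the toy 4-by-4 matrix and the group codes are all hypothetical, and the first, last and zigzag methods are deliberately left out.

```python
from statistics import mean

def reduce_similarity(sim, groups, method="mean"):
    """Collapse a unit-by-unit similarity matrix to a group-by-group one.

    Illustrative only: covers the mean, minimum and maximum methods.
    """
    combine = {"mean": mean, "minimum": min, "maximum": max}[method]
    labels = sorted(set(groups))
    # Units of a group may appear anywhere, so collect their positions.
    members = {g: [i for i, gi in enumerate(groups) if gi == g] for g in labels}
    reduced = []
    for a in labels:
        row = []
        for b in labels:
            if a == b:
                row.append(1.0)  # each diagonal block is replaced by unity
            else:
                # All similarities between a unit of group a and a unit of group b.
                block = [sim[i][j] for i in members[a] for j in members[b]]
                row.append(combine(block))
        reduced.append(row)
    return labels, reduced

# Hypothetical similarity matrix for four units (assumed values).
sim = [
    [1.0, 0.8, 0.2, 0.4],
    [0.8, 1.0, 0.6, 0.0],
    [0.2, 0.6, 1.0, 0.5],
    [0.4, 0.0, 0.5, 1.0],
]
groups = [1, 2, 1, 2]  # the units of a group need not be adjacent

for method in ("mean", "minimum", "maximum"):
    labels, reduced = reduce_similarity(sim, groups, method)
    print(method, labels, reduced)
```

Here the single off-diagonal block holds the four similarities between a unit of group 1 and a unit of group 2, and each method summarizes those four values by one number.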

The SIMILARITY parameter specifies the similarity matrix for the full set of n observations; this must be present and have values. The REDUCEDSIMILARITY parameter specifies an identifier for the reduced similarity matrix, of order b; this will be declared implicitly if you have not declared it already. The factor that defines the classification of the units into groups must be specified by the GROUPS parameter. The units can be in any order, so that for example the units of the first group need not be all together nor given first. The labels of the factor label the reduced similarity matrix.

The PERMUTATION parameter, if present, must specify a variate. It defines the ordering of samples within each group, and so is relevant to methods first, last and zigzag. Within each group, the unit with the lowest value of the permutation variate is taken to be the first sample, the unit with the next lowest value the second, and so on. If a permutation is needed but has not been supplied, Genstat uses a default permutation of 1, 2, ... up to the number of rows of the similarity matrix.
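The within-group ordering implied by a permutation variate can be sketched as follows. This is a hypothetical Python illustration, not Genstat code: the function name `ordered_members`, the group labels and the permutation values are assumptions, and units are numbered from 0 as is usual in Python. It shows only which unit counts as first or last in each group, not how a particular method then uses those units.

```python
def ordered_members(groups, permutation=None):
    """Return, for each group, its units sorted by permutation value."""
    if permutation is None:
        # Default ordering 1, 2, ..., n, as described in the text.
        permutation = list(range(1, len(groups) + 1))
    members = {}
    # Visit units from lowest to highest permutation value.
    for unit in sorted(range(len(groups)), key=lambda i: permutation[i]):
        members.setdefault(groups[unit], []).append(unit)
    return members

groups = ["A", "B", "A", "B", "A"]   # hypothetical grouping of five units
permutation = [3, 1, 2, 5, 4]        # hypothetical permutation variate

order = ordered_members(groups, permutation)
first = {g: units[0] for g, units in order.items()}   # first sample per group
last = {g: units[-1] for g, units in order.items()}   # last sample per group
print(order, first, last)
```

With these values, unit 2 has the lowest permutation value in group A and so is its first sample, even though unit 0 comes earlier in the data.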

If you set option PRINT=similarities, the values of the reduced symmetric matrix are printed, as percentages.

(Note: this directive was originally called REDUCE.)

Options: PRINT, METHOD.

Parameters: SIMILARITY, REDUCEDSIMILARITY, GROUPS, PERMUTATION.

Reference

Rayner, J.H. (1966). Classification of soils by numerical methods. Journal of Soil Science, 17, 79-92.

Example

" Examples 2:6.1.2, 2:6.1.3, 2:6.10.3, 2:6.19.2a-d, 2:6.19.3a-b & 2:6.19.4 "
" Data from Observers Book of Automobiles 1986

  16 Italian cars and 12 measurements/characteristics

   1.  engine capacity        c.c.        Engcc
   2.  number of cylinders                Ncyl
   3.  fuel tank              litres      Tankl
   4.  unladen weight         kg          Weight
   5.  length                 cm          Length
   6.  width                  cm          Width
   7.  height                 cm          Height
   8.  wheelbase              cm          Wbase
   9.  top speed              kph         Tspeed
  10.  time to 100kph         secs        Stst
  11.  carburettor/inj/diesel 1/2/3       Carb
  12.  front/rear wheel drive 1/2         Drive
"
UNITS [NVALUES=16]
VARIATE Engcc,Ncyl,Tankl,Weight,Length,Width,Height,Wbase,Tspeed,Stst,\
  Carb,Drive,Vct[1...3]
POINTER Cd; VALUES=!P(Engcc,Ncyl,Tankl,Weight,Length,\ 
  Width,Height,Wbase,Tspeed,Stst)
READ [PRINT=errors] #Cd,Carb,Drive
  1490  4  50  966 414 161 133 245 177 10.9  1  2
  1409  4  50  845 399 162 139 242 174 10.2  1  2
  2492  6  49 1160 433 163 140 251 210  8.2  1  1
  3185  8  87 1430 458 179 126 265 249  7.4  2  1
  4942 12 120 1506 449 198 113 255 291  5.8  2  1
  1995  4  70 1180 450 176 143 266 209  7.8  2  2
   965  4  35  761 338 149 146 216 134 16.8  1  2
  1585  4  55  970 426 165 141 244 180 10.0  1  2
  1714  4  55  980 426 165 141 245 150 18.9  3  2
   999  4  42  720 364 155 143 236 145 16.2  1  2
  1498  4  48  912 397 157 118 220 171 11.0  1  1
  5167 12 120 1446 414 200 107 245 286  4.9  1  1
  1585  4  45 1000 389 162 138 247 195  8.2  1  2
  1995  4  70 1150 459 175 143 266 224  7.6  2  2
  1049  4  47  790 339 151 143 216 179 11.8  1  2
  1995  4  45 1050 414 162 125 228 190  9.0  2  1 :
TEXT [VALUES=Estate,'Arna1.5','Alfa2.5',Mondialqc,\
  Testarossa,Croma,Panda,Regatta,Regattad,Uno,\
  X19,Contach,Delta,Thema,Y10,Spider] Carname
FACTOR [Carname; LEVELS=16] Fcar; VALUES=!(1...16)
SYMMETRICMATRIX [ROWS=Carname] Carsim
" Form similarity matrix between cars."
FSIMILARITY [SIMILARITY=Carsim; PRINT=similarities] #Cd,Carb,Drive;\ 
  TEST=4(cityblock),4(Euclidean),2(cityblock),2(simplematch)
" Form reduced similarity matrix for makers."
FACTOR [LABELS=!t(Fiat,'Alfa Romeo',Lancia,Ferrari,Lamborghini,\
  Pininfarina)] Maker; VALUES=!(2,2,2,4,4,1,1,1,1,1,1,5,3,3,3,6)
SYMMETRICMATRIX [ROWS=Maker] Makersim
HREDUCE [PRINT=similarities; METHOD=mean] Carsim;\ 
  REDUCEDSIMILARITY=Makersim; GROUPS=Maker

Updated on March 7, 2019
