Lists the data matrix in abbreviated form.
Options
|GROUPS|Defines groupings of the units; used to split the printed table at appropriate places and to label the groups; default
|UNITS|Names for the rows (i.e. units) of the table; default
Parameters
|DATA|The data variables|
|TEST|Test type, defining how each variable is treated in the calculation of the similarity between each unit (
|RANGE|Range of possible values of each variable; if omitted, the observed range is taken|
HLIST lists the values of the data matrix in a condensed form, either in their original order or, more usefully, in the order determined by a cluster analysis (see HCLUSTER). This representation can be very helpful for revealing patterns in the data that are associated with clusters, or for an initial scan of the data to pick out interesting features of the variables.
The DATA parameter specifies a list of variates or factors, all of which must be of the same length. The TEST parameter specifies a list of strings, one for each variate or factor in the DATA parameter list, to define the "type" of each one. This is similar to the TEST parameter used in FSIMILARITY to determine how differences in variate or factor values for each unit contribute to the overall similarity between units. However, HLIST distinguishes only between qualitative variables (factors, or variates with settings simplematching to rogerstanimoto) and quantitative variables (variates with other settings).
The values of qualitative variables are printed directly. If the range of a quantitative variate is greater than 10, the printed values are scaled to lie in the range 0 to 10. This scaling is done by subtracting the minimum value, dividing by the range and then multiplying by 10. If the range is less than 10, the values are printed unscaled, so quantitative variates with values that are all less than 1 will appear as 0 in the abbreviated table. The values are printed with no decimal places, and in a field-width of 3.
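The scaling rule for quantitative variates can be illustrated with a short Python sketch. This is not Genstat code and not the implementation of HLIST; the function name `abbreviate` is hypothetical, and two details are assumptions because the text does not state them: a range of exactly 10 is left unscaled, and values are rounded rather than truncated when the decimal places are dropped.

```python
def abbreviate(values):
    """Scale one quantitative variable as described above and format each value."""
    low, high = min(values), max(values)
    span = high - low
    if span > 10:
        # Subtract the minimum, divide by the range, multiply by 10.
        shown = [10 * (v - low) / span for v in values]
    else:
        # Assumption: a range of 10 or less is printed unscaled.
        shown = list(values)
    # No decimal places, field width 3. Python's formatting rounds here;
    # whether HLIST rounds or truncates is an assumption.
    return [f"{v:3.0f}" for v in shown]


# Illustrative values only (not taken from HCLU-1.DAT).
print("".join(abbreviate([1100, 1500, 2500, 4900])))
print("".join(abbreviate([0.2, 0.3, 0.4])))
```

The first call rescales the values onto 0 to 10 because their range exceeds 10. The second leaves the values unscaled, so every entry is printed as 0, which is the effect the text warns about for variates whose values are all less than 1.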
The RANGE parameter contains a list of scalars, one for each variable in the DATA list. This allows you to check that the values of each variable lie within the given range. The range is also used to standardize quantitative variates, so you can impose a standard range, for example when variates are measured on commensurate scales. You can omit the RANGE setting for all or any of the variables by giving a missing identifier or a scalar with a missing value; Genstat then uses the observed range.
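The effect of supplying a common range can be sketched in Python as follows. This is an illustration under stated assumptions, not Genstat code: the names `effective_range` and `scale` are hypothetical, `None` stands in for a missing RANGE scalar, the supplied value is treated as the width of the range, and raising an error when the observed range exceeds the supplied one is an assumption about how the check is reported.

```python
def effective_range(values, supplied=None):
    """Return the range used to standardize one quantitative variable."""
    observed = max(values) - min(values)
    if supplied is None:
        # Stands in for a missing RANGE scalar: use the observed range.
        return observed
    if observed > supplied:
        # Assumption: values falling outside the given range are an error.
        raise ValueError(
            f"observed range {observed} exceeds supplied range {supplied}"
        )
    return supplied


def scale(values, supplied=None):
    """Scale to 0..10 with the supplied range if given, else the observed one."""
    span = effective_range(values, supplied)
    low = min(values)  # assumption: the observed minimum is still subtracted
    return [10 * (v - low) / span for v in values]


# Illustrative values in cm (not taken from HCLU-1.DAT).
lengths = [365, 410, 440]
widths = [150, 165, 170]

print(scale(lengths))       # standardized by the observed range
print(scale(lengths, 100))  # standardized by a common range of 100
print(scale(widths, 100))
```

With the observed range, each variable is stretched to fill 0 to 10 regardless of its units. With a common range, a difference of 1 cm contributes the same amount in both variables, which is the point of imposing a standard range on commensurate scales.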
The UNITS option allows you to change the labelling of the units in the table; you can specify a text, a pointer or a variate.
You can use the GROUPS option to specify a factor that will split the units into groups. The table from HLIST is then divided into sections corresponding to the groups. If the factor has labels, these are used to annotate the sections; otherwise a group number is used.
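The sectioning behaviour can be pictured with a small Python sketch. It is only an illustration: the function name `sectioned_table`, the layout of each line and the "Group n" wording are hypothetical, and the data values are made up. Only the idea of one section per group, headed by a label when one exists and by a group number otherwise, comes from the description above.

```python
def sectioned_table(units, rows, groups, labels=None):
    """Return the lines of an abbreviated table, one section per group."""
    lines = []
    for level in sorted(set(groups)):
        # Use the factor label if there is one, otherwise the group number.
        heading = labels[level] if labels else f"Group {level}"
        lines.append(heading)
        for unit, row, group in zip(units, rows, groups):
            if group == level:
                # Unit name, then each value in a field of width 3.
                lines.append(f"{unit:<12}" + "".join(f"{v:3d}" for v in row))
    return lines


# Made-up abbreviated values for four of the cars named in the example.
units = ["Panda", "Uno", "Testarossa", "Contach"]
rows = [[0, 4], [1, 4], [9, 12], [10, 12]]
groups = [1, 1, 2, 2]

for line in sectioned_table(units, rows, groups, labels={1: "Small", 2: "Sports"}):
    print(line)
```

Calling `sectioned_table(units, rows, groups)` without labels gives the same sections headed by group numbers instead.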
You can restrict any of the DATA variates or factors to list only a subset of the units. If more than one of these is restricted, then they must all be restricted to the same set of units.
Commands for: Multivariate and cluster analysis.
" Genstat example HCLU-1: Cluster analysis

  Data from 'Observers Book of Automobiles', 1986
  16 Italian cars and 10 measurements:
     1. engine capacity          c.c.    CC
     2. number of cylinders              NCyl
     3. fuel tank                litres  Tank
     4. unladen weight           kg      Wt
     5. length                   cm      Length
     6. width                    cm      Width
     7. height                   cm      Ht
     8. wheelbase                cm      Wbase
     9. top speed                kph     TSpeed
    10. time to 100kph           secs    StSt
    11. carburettor/inj/diesel   1/2/3   Carb
    12. front/rear wheel drive   1/2     Drive
"
TEXT [VALUES=Estate,'Arna1.5','Alfa2.5',Mondialqc,Testarossa,Croma,\
  Panda,Regatta,Regattad,Uno,X19,Contach,Delta,Thema,Y10,Spider] Cars
POINTER [VALUES=CC,NCyl,Tank,Wt,Length,Width,Ht,WBase,TSpeed,StSt,\
  Carb,Drive] Vars
" Read the data - measurements and carnames - from the file 'HCLU-1.DAT',
  and then display it."
OPEN '%gendir%/examples/HCLU-1.DAT'; CHANNEL=cardat
READ [CHANNEL=cardat] Vars
CLOSE cardat
" Treat the number of cylinders, data, differently to the continuous measurements."
HLIST [UNITS=Cars] \
  Vars; TEST=4(cityblock,euclidean),2(cityblock,simplematching)
" Form a hierarchical clustering of the cars, using the single linkage method."
SYMMETRIC [ROWS=Cars] CarSim
FSIMILARITY [SIMILARITY=CarSim]\
  Vars; TEST=4(cityblock,euclidean),2(cityblock,simplematching)
HCLUSTER [PRINT=amalgamations; METHOD=single] CarSim
" Use the average-linkage method."
HCLUSTER [PRINT=dendrogram; METHOD=average] CarSim;\
  AMALGAMATIONS=Am; PERMUTATION=Perm
" Display a high-resolution dendrogram."
DDENDROGRAM [ORDERING=given] DATA=Am; PERMUTATION=Perm; LABELS=Cars;\
  TITLE='Italian cars clustered by average linkage'