Fills holes within clusters of points in multi-dimensional space (R.W. Payne).
Options
PRINT = string tokens |
Controls printed output (cellclusters); default * i.e. none |
DIAGONALS = string token |
Whether to include diagonal cells (include, exclude); default include |
DISTANCE = scalar |
Maximum distance between cells and adjacent cells; default 1 |
NUNCLASSIFIED = scalar |
How many adjacent cells may be unclassified; default 0 |
NNEWCELLS = scalar |
Saves the number of cells that have been added to clusters |
Parameters
CELLCLUSTERS = tables |
Clusters of cells containing holes to be filled |
NEWCELLCLUSTERS = tables |
Clusters with filled holes; if unset, the CELLCLUSTERS table itself is updated |
Description
The PTFILLCLUSTERS procedure can be used to fill holes within the clusters produced by the PTFCLUSTERS procedure. That procedure partitions a multi-dimensional space into cells, and then clusters the cells according to the density of the points that they contain. The clusterings of cells can be saved in a table classified by factors indexing the dimensions of the space. This contains either a cluster number, or missing values for cells that have not been allocated to any cluster. PTFILLCLUSTERS finds unallocated cells that are adjacent to the cells of a cluster, and allocates them to that cluster.
The DISTANCE option specifies how close the cells need to be for them to be classed as adjacent. The default of one indicates that they must be alongside each other. Setting DISTANCE=2 means that there can be an intervening cell, and so on. The default is to include cells that are diagonal to each other, but you can set option DIAGONALS=exclude to exclude these. The NUNCLASSIFIED option specifies how many cells adjacent to an unclassified cell may also be unclassified. This means that cells that are not completely surrounded by a cluster (like cells in an indentation at the edge of the cluster) can still be allocated to the cluster.
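The effect of DISTANCE and DIAGONALS on which cells count as adjacent can be illustrated with a small Python sketch. This is not the procedure's own code: the helper name `neighbour_offsets` is hypothetical, and the treatment of DIAGONALS=exclude for distances above one (keeping only offsets that differ in a single dimension) is an assumption made for illustration.

```python
from itertools import product


def neighbour_offsets(ndim, distance=1, diagonals=True):
    """Return index offsets to the cells treated as adjacent to a cell.

    Hypothetical helper, not part of Genstat. `distance` plays the role of
    the DISTANCE option and `diagonals` the role of DIAGONALS.
    """
    steps = range(-distance, distance + 1)
    offsets = []
    for offset in product(steps, repeat=ndim):
        nonzero = sum(1 for step in offset if step != 0)
        if nonzero == 0:
            continue  # skip the cell itself
        if not diagonals and nonzero > 1:
            # Assumption: excluding diagonals keeps only offsets that
            # move along a single dimension.
            continue
        offsets.append(offset)
    return offsets


# In two dimensions with the defaults, a cell has 8 adjacent cells;
# excluding diagonals leaves the 4 cells alongside it.
print(len(neighbour_offsets(2)))
print(len(neighbour_offsets(2, diagonals=False)))
```

Under these assumptions, increasing `distance` to 2 in two dimensions widens the neighbourhood to the surrounding 5 x 5 block of cells, minus the cell itself.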
The CELLCLUSTERS parameter specifies the tables containing holes to be filled. The new tables can be saved using the NEWCELLCLUSTERS parameter. If this is unset, the CELLCLUSTERS tables are updated. The NNEWCELLS option can save a scalar containing the number of unallocated cells that now belong to a cluster. You can print the updated tables by setting option PRINT=cellclusters.
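The relationship between the input table, the filled table and the count of new cells can be sketched in Python as a single pass over a two-dimensional grid. This is one possible reading of the rule, not the procedure's actual algorithm: the function name `fill_holes` is hypothetical, `None` stands in for a missing value, adjacency is fixed at distance one with diagonals included, cells beyond the edge of the grid are assumed to count as unclassified, and a cell is filled only when all its classified neighbours belong to one cluster.

```python
def fill_holes(grid, nunclassified=0):
    """Return (filled_grid, n_new_cells) for a 2-D grid of cluster numbers.

    Hypothetical sketch: `grid` is a list of rows, with None marking an
    unallocated cell. The input is left unchanged, mirroring the case
    where a separate NEWCELLCLUSTERS table is supplied.
    """
    nrows, ncols = len(grid), len(grid[0])
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    filled = [row[:] for row in grid]
    n_new = 0
    for r in range(nrows):
        for c in range(ncols):
            if grid[r][c] is not None:
                continue  # already allocated to a cluster
            clusters = set()
            unclassified = 0
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < nrows and 0 <= cc < ncols
                if inside and grid[rr][cc] is not None:
                    clusters.add(grid[rr][cc])
                else:
                    # Assumption: off-grid cells count as unclassified.
                    unclassified += 1
            if len(clusters) == 1 and unclassified <= nunclassified:
                filled[r][c] = clusters.pop()
                n_new += 1  # plays the role of NNEWCELLS
    return filled, n_new


ring = [
    [None, None, None, None, None],
    [None, 2, 2, 2, None],
    [None, 2, None, 2, None],
    [None, 2, 2, 2, None],
    [None, None, None, None, None],
]
filled, n_new = fill_holes(ring)
print(n_new)         # 1: only the enclosed centre cell is filled
print(filled[2][2])  # 2
```

With `nunclassified=0` only the completely enclosed cell is allocated; raising it lets cells with some unallocated neighbours, such as those in an indentation, be allocated too.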
Options: PRINT, DIAGONALS, DISTANCE, NUNCLASSIFIED, NNEWCELLS.
Parameters: CELLCLUSTERS, NEWCELLCLUSTERS.
Method
PTFILLCLUSTERS uses the NEIGHBOURS procedure to find the cells neighbouring the clusters. It then uses the ADJACENTCELLS procedure to find the cells adjacent to these cells.
See also
Procedures: ADJACENTCELLS, NEIGHBOURS, PCPCLUSTER, PTFCLUSTERS.
Commands for: Multivariate and cluster analysis, Spatial statistics.
Example
CAPTION 'PTFILLCLUSTERS example'; STYLE=meta
FACTOR [LEVELS=15] rows
& [LEVELS=10] columns
TABLE [CLASS=rows,columns] cellclusters
READ cellclusters
* * * * * * * 1 1 *
* * * 3 3 * * 1 1 1
* * * * * * * 1 1 *
* 2 2 2 * * * * * *
* 2 * 2 * * * * * *
* 2 * 2 * * * * * *
* 2 * 2 * * * * * *
* 2 2 2 * * * * * *
* 2 * 2 * * * * * *
* 2 2 2 * * * * * *
* * 2 2 2 * * * * *
* 2 2 2 2 * * 4 * *
* 2 2 2 2 * 4 * 4 *
* 2 * * 2 * * 4 * *
* 2 * * 2 * * * * * :
PTFILLCLUSTERS [PRINT=cellclusters]\
  cellclusters; NEWCELLCLUSTERS=newclus
PTFILLCLUSTERS [PRINT=cellclusters; DIAGONALS=exclude]\
  cellclusters; NEWCELLCLUSTERS=newclus
PTFILLCLUSTERS [PRINT=cellclusters; DISTANCE=2; NUNCLASSIFIED=1]\
  cellclusters; NEWCELLCLUSTERS=newclus