Fills holes within clusters of points in multi-dimensional space (R.W. Payne).
Options
PRINT = string tokens
Controls printed output (cellclusters); default * i.e. none
DIAGONALS = string token
Whether to include diagonal cells (include, exclude); default incl
DISTANCE = scalar
Maximum distance between cells and adjacent cells; default 1
NUNCLASSIFIED = scalar
How many adjacent cells may be unclassified; default 0
NNEWCELLS = scalar
Saves the number of cells that have been added to clusters
Parameters
CELLCLUSTERS = tables
Clusters of cells containing holes to be filled
NEWCELLCLUSTERS = tables
Clusters with filled holes; if unset, the CELLCLUSTERS table itself is updated
Description
The PTFILLCLUSTERS procedure can be used to fill holes within the clusters produced by the PTFCLUSTERS procedure. That procedure partitions a multi-dimensional space into cells, and then clusters the cells according to the density of the points that they contain. The clusterings of cells can be saved in a table classified by factors indexing the dimensions of the space. Each cell of this table contains either a cluster number, or a missing value if the cell has not been allocated to any cluster. PTFILLCLUSTERS finds unallocated cells that are adjacent to the cells of a cluster, and allocates them to that cluster.
The DISTANCE option specifies how close the cells need to be for them to be classed as adjacent. The default of one indicates that they must be alongside each other. Setting DISTANCE=2 means that there can be an intervening cell, and so on. The default is to include cells that are diagonal to each other, but you can set option DIAGONALS=exclude to exclude these. The NUNCLASSIFIED option specifies how many cells adjacent to an unclassified cell may also be unclassified. This means that cells that are not completely surrounded by a cluster (like cells in an indentation at the edge of the cluster) can still be allocated to the cluster.
The CELLCLUSTERS parameter specifies the tables containing holes to be filled. The new tables can be saved using the NEWCELLCLUSTERS parameter. If this is unset, the CELLCLUSTERS tables are updated. The NNEWCELLS option can save a scalar containing the number of unallocated cells that now belong to a cluster. You can print the updated tables by setting option PRINT=cellclusters.
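As a minimal sketch of the two ways of saving the results, assuming that a table cellclusters has already been set up as in the Example section: the identifiers nfilled and filled are hypothetical names chosen for illustration, and only the options and parameters documented on this page are used.

```
" Sketch only: nfilled and filled are hypothetical identifiers "
" Save the filled clusters in a new table, leaving cellclusters unchanged "
PTFILLCLUSTERS [NNEWCELLS=nfilled] cellclusters; NEWCELLCLUSTERS=filled
PRINT nfilled

" With NEWCELLCLUSTERS unset, cellclusters itself is updated "
PTFILLCLUSTERS [NNEWCELLS=nfilled] cellclusters
PRINT cellclusters
```

Saving to a separate table is likely to be the safer choice when you want to compare several settings of DISTANCE or NUNCLASSIFIED, because each call then starts from the same original clusters.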
Options: PRINT, DIAGONALS, DISTANCE, NUNCLASSIFIED, NNEWCELLS.
Parameters: CELLCLUSTERS, NEWCELLCLUSTERS.
Method
PTFILLCLUSTERS uses the NEIGHBOURS procedure to find the cells neighbouring the clusters. It then uses the ADJACENTCELLS procedure to find the cells adjacent to these cells.
See also
Procedures: ADJACENTCELLS, NEIGHBOURS, PCPCLUSTER, PTFCLUSTERS.
Commands for: Multivariate and cluster analysis, Spatial statistics.
Example
CAPTION 'PTFILLCLUSTERS example'; STYLE=meta
FACTOR [LEVELS=15] rows
& [LEVELS=10] columns
TABLE [CLASS=rows,columns] cellclusters
READ cellclusters
* * * * * * * 1 1 *
* * * 3 3 * * 1 1 1
* * * * * * * 1 1 *
* 2 2 2 * * * * * *
* 2 * 2 * * * * * *
* 2 * 2 * * * * * *
* 2 * 2 * * * * * *
* 2 2 2 * * * * * *
* 2 * 2 * * * * * *
* 2 2 2 * * * * * *
* * 2 2 2 * * * * *
* 2 2 2 2 * * 4 * *
* 2 2 2 2 * 4 * 4 *
* 2 * * 2 * * 4 * *
* 2 * * 2 * * * * * :
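" Default settings: diagonal cells included, DISTANCE=1, NUNCLASSIFIED=0 "
PTFILLCLUSTERS [PRINT=cellclusters]\
cellclusters; NEWCELLCLUSTERS=newclus
" Exclude diagonally adjacent cells "
PTFILLCLUSTERS [PRINT=cellclusters; DIAGONALS=exclude]\
cellclusters; NEWCELLCLUSTERS=newclus
" Allow one intervening cell, and one unclassified adjacent cell "
PTFILLCLUSTERS [PRINT=cellclusters; DISTANCE=2; NUNCLASSIFIED=1]\
cellclusters; NEWCELLCLUSTERS=newclus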