Derives association rules from transaction data.
Options
PRINT = string tokens | Controls printed output (rules); default rule |
---|---|
METHOD = string tokens | What to use to calculate the support of a rule (allitems, antecedent); default ante |
MINSUPPORT = scalar | Minimum amount of support for a rule to be included; default 0.1 |
MINCONFIDENCE = scalar | Minimum amount of confidence for a rule to be included; default 0.8 |
MAXITEMS = scalar | Maximum number of items that a rule may contain; default 10 |
MAXRULES = scalar | Maximum number of rules to generate; default 100 |
Parameters
ITEMS = factors | Items in the transactions |
---|---|
TRANSACTIONS = factors | Specifies the transaction to which each item belongs |
NRULES = scalars | Saves the number of rules that have been derived |
RULES = pointers | Pointer to factors, each of which saves the antecedent items and then the consequent item in one of the rules |
SUPPORT = variates | Saves the support values for the rules |
CONFIDENCE = variates | Saves the confidence values for the rules |
Description
ASRULES examines a set of “transaction data” to derive rules of the form: “if a transaction contains items a1 … am, then it is likely also to contain item c”. The items a1 … am are known as the antecedent set, and the item c is known as the consequent item.
The data are specified in a pair of factors, using the ITEMS and TRANSACTIONS parameters. ITEMS specifies the items involved in (all) the transactions, and TRANSACTIONS specifies the transaction to which each item belongs. The data must be provided in sorted order, one transaction at a time, with the items within each transaction in ascending order. For example
Items 2 3 5 1 6 7 8 4 7 9 10 1 3 4 6 8 ...
Transactions 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 4 ...
You can put the data into this order with the SORT directive. For example, if the transactions factor is Trans and the items factor is Items, the command would be
SORT [INDEX=Trans,Items] Trans,Items
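The required ordering can also be illustrated outside Genstat. The following is a minimal Python sketch, not Genstat code, that sorts a pair of parallel lists by transaction and then by item within each transaction. The unsorted input values and the names trans and items are made up for illustration.

```python
# Hypothetical unsorted input: parallel lists, one entry per (transaction, item) pair.
trans = [2, 1, 3, 1, 2, 1, 3, 2, 3, 2, 3]
items = [7, 3, 9, 2, 1, 5, 4, 8, 7, 6, 10]

# Sorting the (transaction, item) tuples orders them by transaction first,
# then by item within each transaction.
pairs = sorted(zip(trans, items))
sorted_trans = [t for t, _ in pairs]
sorted_items = [i for _, i in pairs]

print("Items       ", *sorted_items)
print("Transactions", *sorted_trans)
```

After sorting, the two lists have the same layout as the first three transactions of the example above.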
ASRULES finds sets of items that occur frequently together within the transactions, and then examines these to derive the rules.
The support of a set of items is the proportion of the transactions that contain them. To avoid presenting rules that have little justification, the MINSUPPORT option defines a minimum value for the support of a rule for it to be included (default 0.1). The METHOD option controls whether the support is defined to be the support for all the items in the rule, or only for its antecedent items (the default).
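The two definitions of support can be illustrated with a short Python sketch. It is not part of Genstat, and the function name support is an assumption made for illustration. The transactions are the four complete ones visible in the sorted layout shown earlier, and the rule "if 1 then 7" is chosen arbitrarily.

```python
# Four transactions, each stored as a set of items (illustrative data only).
transactions = [
    {2, 3, 5},
    {1, 6, 7, 8},
    {4, 7, 9, 10},
    {1, 3, 4, 6, 8},
]


def support(itemset, transactions):
    """Proportion of the transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


antecedent = {1}
consequent = 7

# Support based on the antecedent items only (METHOD=antecedent, the default).
support_antecedent = support(antecedent, transactions)
# Support based on all the items in the rule (METHOD=allitems).
support_allitems = support(antecedent | {consequent}, transactions)

print(support_antecedent, support_allitems)
```

Here item 1 occurs in two of the four transactions, but items 1 and 7 occur together in only one, so the rule has support 0.5 under the antecedent definition and 0.25 under the all-items definition.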
The confidence of a rule is the proportion of those transactions containing the antecedent set of items that also contain the consequent item. The MINCONFIDENCE option defines a minimum value for the confidence of a rule for it to be included (default 0.8).
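As a hedged illustration of this definition, the Python sketch below counts the transactions that contain the antecedent set and the fraction of those that also contain the consequent item. It is not Genstat code; the function name confidence and the small data set (the four complete transactions from the layout example earlier) are assumptions made for illustration.

```python
# Four transactions, each stored as a set of items (illustrative data only).
transactions = [
    {2, 3, 5},
    {1, 6, 7, 8},
    {4, 7, 9, 10},
    {1, 3, 4, 6, 8},
]


def confidence(antecedent, consequent, transactions):
    """Fraction of the transactions containing the antecedent set
    that also contain the consequent item."""
    antecedent = set(antecedent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        # The antecedent never occurs, so the confidence is undefined.
        return None
    with_both = sum(1 for t in with_antecedent if consequent in t)
    return with_both / len(with_antecedent)


conf_1_to_6 = confidence({1}, 6, transactions)
conf_1_to_7 = confidence({1}, 7, transactions)
print(conf_1_to_6, conf_1_to_7)
```

With the default MINCONFIDENCE of 0.8, the rule "if 1 then 6" (confidence 1.0) would pass the threshold in this toy data set, whereas "if 1 then 7" (confidence 0.5) would not.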
The MAXITEMS option sets a maximum limit on the number of items that a rule may contain (default 10), and the MAXRULES option specifies the maximum number of rules that may be generated (default 100).
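To show how the four thresholds interact, here is a brute-force Python sketch. It is only an illustration and is not the algorithm used by ASRULES. It rests on several assumptions: the number of items in a rule is taken to be the antecedent items plus the consequent item, the search simply stops once max_rules rules have been found (how ASRULES chooses which rules to keep is not stated here), and the function name derive_rules and its arguments are hypothetical.

```python
from itertools import combinations


def derive_rules(transactions, min_support=0.1, min_confidence=0.8,
                 max_items=10, max_rules=100, method="antecedent"):
    """Enumerate rules (antecedent, consequent, support, confidence) by brute force."""
    n = len(transactions)
    all_items = sorted(set().union(*transactions))

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    # Assumption: a rule with k items has k - 1 antecedent items and one consequent.
    for k in range(2, min(max_items, len(all_items)) + 1):
        for antecedent in combinations(all_items, k - 1):
            a = set(antecedent)
            n_a = count(a)
            if n_a == 0:
                continue  # the antecedent never occurs
            for c in all_items:
                if c in a:
                    continue
                n_all = count(a | {c})
                supp = (n_a if method == "antecedent" else n_all) / n
                conf = n_all / n_a
                if supp >= min_support and conf >= min_confidence:
                    rules.append((antecedent, c, supp, conf))
                    if len(rules) >= max_rules:
                        return rules  # assumption: stop at the limit
    return rules


# Illustrative data: the four complete transactions from the layout example.
transactions = [{2, 3, 5}, {1, 6, 7, 8}, {4, 7, 9, 10}, {1, 3, 4, 6, 8}]

rules = derive_rules(transactions, max_items=2)
limited = derive_rules(transactions, max_items=2, max_rules=5)
print(len(rules), len(limited))
for antecedent, c, supp, conf in limited:
    print(antecedent, "->", c, supp, conf)
```

Exhaustive enumeration like this grows very quickly with the number of items, which is one reason limits such as MAXITEMS and MAXRULES are useful in practice.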
By default the rules are printed, with their support and confidence values. However, this can be suppressed by setting option PRINT=*.
The number of rules that have been derived can be saved, in a scalar, using the NRULES parameter. The rules themselves can be saved using the RULES parameter, in a pointer to a set of factors. Each factor contains the antecedent items and then the consequent item for a particular rule. The SUPPORT parameter can save a variate with a unit for each rule, containing its support. The CONFIDENCE parameter similarly saves the confidence of the rules.
Options: PRINT, METHOD, MINSUPPORT, MINCONFIDENCE, MAXITEMS, MAXRULES.
Parameters: ITEMS, TRANSACTIONS, NRULES, RULES, SUPPORT, CONFIDENCE.
Method
ASRULES uses the function nagdmc_assoc from the Numerical Algorithms Group’s library of Data Mining Components (DMCs).
Action with RESTRICT
ITEMS and TRANSACTIONS may be restricted to derive the rules from only a subset of the data.
See also
Procedure: KNEARESTNEIGHBOURS.
Commands for: Data mining.
Example
CAPTION 'ASRULES example'; STYLE=meta
" Example from NAG Data Mining Components "
FACTOR [LEVELS=5; VALUES=3,4,4,5, 2,3,4, 1,4,5, 1,2,2,2,2,3,4, 2,3,\
  1,2,4, 4,5, 1,2,3,4, 1,2,3, 1,2,3] Items
FACTOR [LEVELS=10; VALUES=4(1),3(2),3(3),7(4),2(5),3(6),2(7),4(8),3(9),3(10)]\
  Transactions
ASRULES Items; TRANSACTIONS=Transactions; NRULES=Nr; RULES=Rules;\
  SUPPORT=Support; CONFIDENCE=Confidence
PRINT Nr
FOR [NTIMES=NVALUES(Rules); INDEX=i]
  PRINT Rules[i] & Support$[i],Confidence$[i]
ENDFOR