Derives association rules from transaction data.
|Controls printed output (
|METHOD|What to use to calculate the support of a rule (
|MINSUPPORT|Minimum amount of support for a rule to be included; default 0.1|
|MINCONFIDENCE|Minimum amount of confidence for a rule to be included; default 0.8|
|MAXITEMS|Maximum number of items that a rule may contain; default 10|
|MAXRULES|Maximum number of rules to generate; default 100|
|ITEMS|Items in the transactions|
|TRANSACTIONS|Specifies the transaction to which each item belongs|
|NRULES|Saves the number of rules that have been derived|
|RULES|Pointer to factors, each of which saves the antecedent items and then the consequent item in one of the rules|
|SUPPORT|Saves the support values for the rules|
|CONFIDENCE|Saves the confidence values for the rules|
ASRULES examines a set of “transaction data” to derive rules of the form: “if a transaction contains items a1 … am, then it is likely also to contain item c”. The items a1 … am are known as the antecedent set, and the item c is known as the consequent item.
The data are specified in a pair of factors, using the ITEMS and TRANSACTIONS parameters. ITEMS specifies the items involved in (all) the transactions, and TRANSACTIONS specifies the transaction to which each item belongs. The data must be provided in sorted order, one transaction at a time, with the items within each transaction in ascending order. For example:
Items 2 3 5 1 6 7 8 4 7 9 10 1 3 4 6 8 ...
Transactions 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 4 ...
You can put the data into this order with the SORT directive. For example, if the transactions factor is Trans and the items factor is Items, the command would be
SORT [INDEX=Trans,Items] Trans,Items
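The required ordering can also be illustrated outside Genstat. The following Python sketch is not Genstat code and is not part of ASRULES; it only shows, on made-up data, what "sorted by transaction, then by item within each transaction" means for a pair of parallel lists. The names `trans` and `items` and the values in them are hypothetical.

```python
# Hypothetical unsorted data: parallel lists with one entry per
# (transaction, item) pair, mirroring the two factors described above.
trans = [2, 1, 2, 1, 3, 1, 2, 3]
items = [7, 5, 1, 2, 4, 3, 6, 9]

# Sort the pairs by transaction first, then by item within each transaction.
pairs = sorted(zip(trans, items))

# Split the sorted pairs back into two parallel lists.
sorted_trans = [t for t, _ in pairs]
sorted_items = [i for _, i in pairs]

print("Transactions", *sorted_trans)
print("Items       ", *sorted_items)
```

After sorting, all entries for one transaction are adjacent and its items appear in ascending order, which is the layout the text above asks for.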
ASRULES finds sets of items that occur frequently together within the transactions, and then examines these to derive the rules.
The support of a set of items is the proportion of the transactions that contain them. To avoid presenting rules that have little justification, the MINSUPPORT option defines a minimum value for the support of a rule for it to be included (default 0.1). The METHOD option controls whether the support is defined to be the support for all the items in the rule, or only for its antecedent items (the default).
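To make the two definitions of support concrete, here is a small Python sketch (an illustration only, not the code that ASRULES runs). It uses the item and transaction values from the example at the end of this page. The helper names `group_transactions` and `support` are hypothetical, and treating an item that is repeated within a transaction as a single occurrence is an assumption of the sketch.

```python
def group_transactions(items, trans):
    """Collect the flat (item, transaction) layout into one item set per transaction."""
    groups = {}
    for item, t in zip(items, trans):
        # Assumption: repeated items within a transaction count once.
        groups.setdefault(t, set()).add(item)
    return groups


def support(itemset, groups):
    """Proportion of the transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for contents in groups.values() if itemset <= contents)
    return hits / len(groups)


# Values taken from the ASRULES example on this page (10 transactions).
items = [3, 4, 4, 5, 2, 3, 4, 1, 4, 5, 1, 2, 2, 2, 2, 3, 4, 2, 3,
         1, 2, 4, 4, 5, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3]
trans = ([1] * 4 + [2] * 3 + [3] * 3 + [4] * 7 + [5] * 2
         + [6] * 3 + [7] * 2 + [8] * 4 + [9] * 3 + [10] * 3)
groups = group_transactions(items, trans)

# Rule under study: "if a transaction contains items 1 and 2, it also contains 3".
antecedent, consequent = {1, 2}, 3
support_antecedent = support(antecedent, groups)            # antecedent items only
support_all = support(antecedent | {consequent}, groups)    # all items in the rule

print(support_antecedent, support_all)
```

The support over all the items in a rule can never exceed the support of its antecedent alone, so a given MINSUPPORT threshold is stricter under the first definition.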
The confidence of a rule is the proportion of those transactions containing the antecedent set of items that also contain the consequent item. The MINCONFIDENCE option defines a minimum value for the confidence of a rule for it to be included (default 0.8). The MAXITEMS option sets a maximum limit on the number of items that a rule may contain (default 10), and the MAXRULES option specifies the maximum number of rules that may be generated (default 100).
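The way these thresholds interact can be sketched with a brute-force search in Python. This is only an illustration under stated assumptions: it is not the algorithm used by ASRULES, the function `derive_rules` and the data are hypothetical, the support threshold is applied to the antecedent items, and the order in which rules are found (and therefore which rules survive the `max_rules` cut-off) is simply the enumeration order of the sketch.

```python
from itertools import combinations


def derive_rules(groups, min_support=0.1, min_confidence=0.8,
                 max_items=10, max_rules=100):
    """Brute-force rule search over a list of item sets (one set per transaction).

    Returns tuples (antecedent, consequent, support, confidence), where the
    support is that of the antecedent items.
    """
    n = len(groups)
    all_items = sorted(set().union(*groups))

    def count(itemset):
        # Number of transactions that contain every item in itemset.
        return sum(1 for g in groups if itemset <= g)

    rules = []
    # A rule holds the antecedent items plus one consequent item, so the
    # antecedent may have at most max_items - 1 items.
    for size in range(1, max_items):
        for antecedent in combinations(all_items, size):
            a = set(antecedent)
            n_a = count(a)
            if n_a == 0 or n_a / n < min_support:
                continue
            for c in all_items:
                if c in a:
                    continue
                confidence = count(a | {c}) / n_a
                if confidence >= min_confidence:
                    rules.append((antecedent, c, n_a / n, confidence))
                    if len(rules) >= max_rules:
                        return rules
    return rules


# Hypothetical data: five transactions.
groups = [{1, 2, 3}, {1, 2, 3}, {1, 2}, {2, 3}, {3, 4}]
rules = derive_rules(groups, min_support=0.4, min_confidence=0.8, max_items=3)
for antecedent, consequent, sup, conf in rules:
    print(antecedent, "->", consequent, "support", sup, "confidence", conf)
```

Raising `min_support` or `min_confidence` removes rules, while `max_items` and `max_rules` bound the size of the search and of the result.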
By default the rules are printed, with their support and confidence values. However, this can be suppressed by setting option
The number of rules that have been derived can be saved, in a scalar, using the NRULES parameter. The rules themselves can be saved using the RULES parameter, in a pointer to a set of factors. Each factor contains the antecedent items and then the consequent item for a particular rule. The SUPPORT parameter can save a variate with a unit for each rule, containing its support. The CONFIDENCE parameter similarly saves the confidence of the rules.
ASRULES uses the function nagdmc_assoc from the Numerical Algorithms Group’s library of Data Mining Components (DMCs).
TRANSACTIONS may be restricted to derive the rules from only a subset of the data.
Commands for: Data mining.
CAPTION 'ASRULES example'; STYLE=meta
" Example from NAG Data Mining Components "
FACTOR  [LEVELS=5; VALUES=3,4,4,5, 2,3,4, 1,4,5, 1,2,2,2,2,3,4, 2,3,\
        1,2,4, 4,5, 1,2,3,4, 1,2,3, 1,2,3] Items
FACTOR  [LEVELS=10; VALUES=4(1),3(2),3(3),7(4),2(5),3(6),2(7),4(8),3(9),3(10)]\
        Transactions
ASRULES Items; TRANSACTIONS=Transactions; NRULES=Nr; RULES=Rules;\
        SUPPORT=Support; CONFIDENCE=Confidence
PRINT   Nr
FOR [NTIMES=NVALUES(Rules); INDEX=i]
  PRINT Rules[i] & Support$[i],Confidence$[i]
ENDFOR