Sample data (menu_orders.txt)
a,c,e
b,d
b,c
a,b,c,d
a,b
b,c
a,b
a,b,c,e
a,b,c
a,c,e
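If you would rather not create the file by hand, a minimal sketch like the following writes the ten baskets above to disk. It assumes the file name menu_orders.txt used by the read.transactions call, and it should be run after setwd() so the file lands in the working directory. The variable name orders is only illustrative.

```r
# The ten sample baskets, one comma-separated transaction per line.
orders <- c("a,c,e", "b,d", "b,c", "a,b,c,d", "a,b",
            "b,c", "a,b", "a,b,c,e", "a,b,c", "a,c,e")

# writeLines() terminates every line, including the last one, with a newline,
# so the file ends with the line break that read.transactions expects.
writeLines(orders, "menu_orders.txt")
```

Because the file is generated rather than typed, the missing trailing newline that can trip up the read step does not occur.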
setwd("/users/XXX/desktop/R/chapter5/示例程序")
# Matrix is a dependency of arules
library(Matrix)
library(arules)
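# Reading the txt file below may fail. If it does, open the file, move the cursor
# to the end of the last line and press Enter so that the file ends with a newline.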
tr<-read.transactions("menu_orders.txt",format="basket",sep=",")
summary(tr)
transactions as itemMatrix in sparse format with
10 rows (elements/itemsets/transactions) and
5 columns (items) and a density of 0.54
most frequent items: # frequency of each item
      b       a       c       e       d (Other)
      8       7       7       3       2       0
element (itemset/transaction) length distribution:
sizes
2 3 4
5 3 2
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    2.0     2.0     2.5     2.7     3.0     4.0
includes extended item information - examples:
labels
1 a
2 b
3 c
inspect(tr)
items
[1] {a,c,e}
[2] {b,d}
[3] {b,c}
[4] {a,b,c,d}
[5] {a,b}
[6] {b,c}
[7] {a,b}
[8] {a,b,c,e}
[9] {a,b,c}
[10] {a,c,e}
# minimum support 0.2, minimum confidence 0.5
rules0<-apriori(tr,parameter=list(support=0.2,confidence=0.5))
rules0
set of 18 rules
inspect(rules0)
     lhs      rhs support confidence      lift
[1]  {}    => {c}     0.7  0.7000000 1.0000000
[2]  {}    => {b}     0.8  0.8000000 1.0000000
[3]  {}    => {a}     0.7  0.7000000 1.0000000
[4]  {d}   => {b}     0.2  1.0000000 1.2500000
[5]  {e}   => {c}     0.3  1.0000000 1.4285714
[6]  {e}   => {a}     0.3  1.0000000 1.4285714
[7]  {c}   => {b}     0.5  0.7142857 0.8928571
[8]  {b}   => {c}     0.5  0.6250000 0.8928571
[9]  {c}   => {a}     0.5  0.7142857 1.0204082
[10] {a}   => {c}     0.5  0.7142857 1.0204082
[11] {b}   => {a}     0.5  0.6250000 0.8928571
[12] {a}   => {b}     0.5  0.7142857 0.8928571
[13] {c,e} => {a}     0.3  1.0000000 1.4285714
[14] {a,e} => {c}     0.3  1.0000000 1.4285714
[15] {a,c} => {e}     0.3  0.6000000 2.0000000
[16] {b,c} => {a}     0.3  0.6000000 0.8571429
[17] {a,c} => {b}     0.3  0.6000000 0.7500000
[18] {a,b} => {c}     0.3  0.6000000 0.8571429
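To see where these numbers come from, rule [15] {a,c} => {e} can be recomputed by hand. The sketch below uses only base R and the ten sample baskets. The helper support_of and the variable names are my own, not part of arules.

```r
# The ten baskets from the sample data, one character vector per transaction.
baskets <- strsplit(c("a,c,e", "b,d", "b,c", "a,b,c,d", "a,b",
                      "b,c", "a,b", "a,b,c,e", "a,b,c", "a,c,e"), ",")

# Fraction of baskets that contain every item in `items`.
support_of <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- c("a", "c")
rhs <- "e"

rule_support    <- support_of(c(lhs, rhs))            # share of baskets with a, c and e
rule_confidence <- rule_support / support_of(lhs)     # share of {a,c} baskets that also have e
rule_lift       <- rule_confidence / support_of(rhs)  # confidence relative to e's base rate

print(c(support = rule_support, confidence = rule_confidence, lift = rule_lift))
```

Three of the ten baskets contain a, c and e, five contain a and c, and three contain e, which gives support 0.3, confidence 0.6 and lift 2, matching row [15]. A lift above 1 means the left-hand side makes the right-hand side more likely than its base rate. A lift below 1, as in row [17], means less likely.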
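This is useful in practice. For example, last time I segmented news headlines into words, and to measure how strongly the words are associated with one another, this same method can be used.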
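A rough sketch of that headline use case follows. The tokenized titles are made-up placeholder data, not my real headlines, and the thresholds are arbitrary. It assumes each headline has already been split into words and treats every headline as one transaction.

```r
library(arules)

# Hypothetical tokenized headlines: one character vector of words per title.
titles <- list(
  c("stock", "market", "rally"),
  c("stock", "market", "drop"),
  c("market", "rally", "tech"),
  c("stock", "tech", "earnings")
)

# Each word may appear only once per transaction, so de-duplicate first.
trans <- as(lapply(titles, unique), "transactions")

# minlen = 2 skips rules with an empty left-hand side such as {} => {market}.
word_rules <- apriori(
  trans,
  parameter = list(support = 0.5, confidence = 0.6, minlen = 2),
  control = list(verbose = FALSE)
)

# Show the word pairs with the strongest association first.
inspect(sort(word_rules, by = "lift"))
```

On a real corpus the support threshold would normally need to be much lower, since any single word pair appears in only a small share of headlines.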