
Using the arules package, 'apriori' returns a 'rules' object.

How can I find out which column each item in a rule's {lhs, rhs} comes from?

Example:

I have some tabular data in the file "input.csv" and want to relate the itemsets in the returned rules to the column headers in the file. How can I do that?

Any pointers are appreciated. Thanks,



A reproducible example:
input.csv

ABC,DEF,GHI,JKL,MNO
11,56789,1,0,10
12,57685,0,0,10
11,56789,0,1,11
10,57689,1,0,12
11,56789,0,1,12
10,57685,1,0,12
10,57689,1,0,10
11,56789,0,1,12
11,56789,0,0,10
11,56789,0,0,10
11,56789,0,1,10
11,56789,0,0,10

Call to apriori:

transactions <- read.transactions("input.csv", format="basket", sep = ',', cols = NULL,  rm.duplicates = TRUE)
Rules <- apriori(transactions, parameter = list(supp = 0.45, conf = 0.50, target = "rules"))

Returned result:

> inspect(Rules)
   lhs            rhs       support confidence     lift
1  {}          => {11}    0.6153846  0.6153846 1.000000
2  {}          => {56789} 0.6153846  0.6153846 1.000000
3  {}          => {1}     0.6153846  0.6153846 1.000000
4  {}          => {10}    0.6923077  0.6923077 1.000000
5  {}          => {0}     0.9230769  0.9230769 1.000000
6  {11}        => {56789} 0.6153846  1.0000000 1.625000
7  {56789}     => {11}    0.6153846  1.0000000 1.625000
8  {11}        => {0}     0.6153846  1.0000000 1.083333
9  {0}         => {11}    0.6153846  0.6666667 1.083333
10 {56789}     => {0}     0.6153846  1.0000000 1.083333
11 {0}         => {56789} 0.6153846  0.6666667 1.083333
12 {1}         => {0}     0.6153846  1.0000000 1.083333
13 {0}         => {1}     0.6153846  0.6666667 1.083333
14 {10}        => {0}     0.6923077  1.0000000 1.083333
15 {0}         => {10}    0.6923077  0.7500000 1.083333
16 {11, 56789} => {0}     0.6153846  1.0000000 1.083333
17 {0, 11}     => {56789} 0.6153846  1.0000000 1.625000
18 {0, 56789}  => {11}    0.6153846  1.0000000 1.625000

Now I want to tell apart the items of, say, rule no. 13:

13 {0} => {1} 0.6153846 0.6666667 1.083333

Does {0} => {1} mean that a value of 0 in column "GHI" implies a value of 1 in "JKL", or vice versa?

So, is there a way to get the column name/id of the values in the itemsets returned in the rules object?

srbhkmr
  • Can you post a small example? – Roman Luštrik May 06 '13 at 12:47
  • @RomanLuštrik : I've added a small example. Thanks for your interest. – srbhkmr May 06 '13 at 12:58
  • @SimonO101 : Thanks, I've added the code that I'm executing. I can't find any method in the `rules` class that gives me back the column name/id to which these itemset values belong. – srbhkmr May 06 '13 at 13:12
  • @srbhkmr we need some example code that *we* can run to make a similar object. We don't have access to `input.csv`, therefore we can't run it. It doesn't have to be the exact same one as yours, just one that adequately illustrates your problem. Please see this guide on [**reproducible examples**](http://stackoverflow.com/q/5963269/1478381). Thanks. – Simon O'Hanlon May 06 '13 at 13:20
  • @SimonO101 : Yes, I'll come up with a short reproducible example soon, but the gist of this is: two columns from my dataset share values from the same domain, and I want to distinguish them when I get them in rule itemsets. – srbhkmr May 06 '13 at 13:24
  • @SimonO101 : I've added a small reproducible example. Hope it makes my question clear now. Thanks, – srbhkmr May 06 '13 at 13:59

1 Answer

lhs = Left Hand Side, rhs = Right Hand Side

To be read as lhs => rhs.

{0} => {1} means: if the transaction contains a 0, it also has a 1 somewhere.

However, as you have not preprocessed your data appropriately, the results are meaningless. Your data definitely does not look like basket input format to me.
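One possible way to keep the column information, sketched here under a few assumptions: read the file as an ordinary data frame, turn every column into a factor, and coerce the data frame to transactions. arules then labels each item as "column=value" (for example "GHI=0"), so a rule shows which column each value came from. The inlined `csv_text` and the variable names `df` and `rules` are illustrative only; with the real file, `read.csv("input.csv")` would take the place of the inlined text. Note also that the supports in the question (0.6153846 = 8/13) suggest the header line was counted as a thirteenth transaction in basket mode, which this approach avoids.

```r
library(arules)

# The question's sample data, inlined so the sketch runs without input.csv.
csv_text <- "ABC,DEF,GHI,JKL,MNO
11,56789,1,0,10
12,57685,0,0,10
11,56789,0,1,11
10,57689,1,0,12
11,56789,0,1,12
10,57685,1,0,12
10,57689,1,0,10
11,56789,0,1,12
11,56789,0,0,10
11,56789,0,0,10
11,56789,0,1,10
11,56789,0,0,10"

# Read as a regular table; the first line becomes the column names.
df <- read.csv(text = csv_text, header = TRUE)

# Treat every column as categorical so each value becomes its own item.
df[] <- lapply(df, factor)

# Coercing a data frame of factors yields items labelled "column=value".
transactions <- as(df, "transactions")

rules <- apriori(transactions,
                 parameter = list(supp = 0.45, conf = 0.50, target = "rules"))
inspect(rules)

# itemInfo() lists, for every item, its label plus the originating
# column (variables) and the value in that column (levels).
itemInfo(transactions)
```

With this encoding, a 0 in "GHI" and a 0 in "JKL" are different items, so they can no longer be confused in the lhs or rhs of a rule.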

Erich Schubert
  • Could you please elaborate a little, what kind of pre-processing have I missed? or why it's not a proper basket input format? Thanks, – srbhkmr May 07 '13 at 09:43
  • Look at the `R` examples. AFAICT, the input format is to read "Spaghetti,Tomato,Basil" in a single line, if a customer bought these three items, and "ToiletPaper" if he bought nothing other than that... Did your customers all buy 6 items, and why is item "0" so popular? – Erich Schubert May 07 '13 at 12:44
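
To illustrate the basket format described in the comment above, here is a minimal sketch. The three shopping lines and the temporary file are made up for illustration and are not part of the original data. Each line is one transaction, items are separated by commas, and lines may have different lengths; position within a line carries no meaning, which is why column identity is lost when a table is read this way.

```r
library(arules)

# Hypothetical basket data: one transaction per line, items separated by commas.
basket_file <- tempfile(fileext = ".csv")
writeLines(c("Spaghetti,Tomato,Basil",
             "ToiletPaper",
             "Spaghetti,Tomato"),
           basket_file)

# format = "basket" treats every line as the set of items in one transaction.
baskets <- read.transactions(basket_file, format = "basket", sep = ",")

inspect(baskets)

# Number of items in each transaction; baskets need not have equal size.
size(baskets)
```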