I have run the Apriori algorithm on my dataset and filtered the output so that only rules between positive items remain. However, the rules I get include inverted repetitions, that is:

   lhs             rhs              support confidence      lift
1  {Mobile=1}   => {Earphone=1} 0.025563474 0.09925997 0.3808200
2  {Earphone=1} => {Mobile=1}   0.025563474 0.09807662 0.3808200
3  {Jeans=1}    => {Shirt=1}    0.024030494 0.09389671 0.3637123
4  {Shirt=1}    => {Jeans=1}    0.024030494 0.09308297 0.3637123

As can be seen, rules 1 and 2 are the same rule with the LHS and RHS interchanged (support and lift are identical; only the confidence differs), and likewise rules 3 and 4. Is there any way to remove such rules from the final result?

My code is:

# Convert the purchase data to transactions and mine the rules
transactions <- as(sold_data, "transactions")
rules <- apriori(transactions, parameter = list(support = 0.001, confidence = 0.05))

# Keep rules whose LHS and RHS contain only "bought" (=1) items, sorted by support
rules_subset <- sort(
  subset(rules,
         (lhs %in% c("Mobile=1", "Earphone=1", "Watch=1", "Jeans=1", "Shirt=1")) &
        !(lhs %in% c("Mobile=0", "Earphone=0", "Watch=0", "Jeans=0", "Shirt=0")) &
         (rhs %in% c("Mobile=1", "Earphone=1", "Watch=1", "Jeans=1", "Shirt=1"))
  ),
  decreasing = TRUE,
  by = "support"
)

inspect(rules_subset)
  • Does the data contain any of the `=>`? If not, can you delete them, it is only adding confusion. – flodel May 31 '14 at 17:14
  • Rows `1` and `2` have different values for the other columns: support, confidence, lift. What are we supposed to do here: select any of the two rows, the first one, do an average? – flodel May 31 '14 at 17:16
  • Yes the output does have => I have pasted the console output I got from running my script – Ram G Athreya May 31 '14 at 18:05
  • @RamGAthreya sorry, have you found the right solution? I have the same problem and I cannot solve it! – Lorenzo Benassi Dec 21 '17 at 15:07
  • @LorenzoBenassi can't say I have. I just did some hacks during that time to get the result I want. – Ram G Athreya Dec 21 '17 at 20:46

1 Answer

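These are usually handled as redundant rules. The code below drops every rule whose items (LHS and RHS taken together) contain all the items of a rule that appears earlier in the list. A mirrored pair such as rules 1 and 2 is built from the same items, so the second rule of each pair is removed. Note that this also removes longer rules that contain an earlier, shorter rule, so it can discard more than just the mirrored duplicates.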

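transactions <- as(sold_data, "transactions")
rules <- apriori(transactions, parameter = list(support = 0.001, confidence = 0.05))
# subset.matrix[i, j] is TRUE when the items of rule i are a subset of the items of rule j.
# Newer arules releases may return a sparse matrix here; is.subset(rules, rules, sparse = FALSE) gives a dense one.
subset.matrix <- is.subset(rules, rules)
# Blank out the lower triangle and diagonal so each rule is compared with earlier rules only
subset.matrix[lower.tri(subset.matrix, diag = TRUE)] <- NA
# Flag a rule when at least one earlier rule is contained in it
redundant <- colSums(subset.matrix, na.rm = TRUE) >= 1
rules <- rules[!redundant]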
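
If only the mirrored pairs should go and all other rules should stay, a narrower option is to build an order-independent key from the two sides of each rule and keep the first rule per key. The sketch below is base R on a toy data frame copied from the output in the question, so `rules_df`, `pair_key` and `deduped` are illustrative names, not part of arules. With a real rules object the two label vectors could presumably be taken from `labels(lhs(rules))` and `labels(rhs(rules))`, and the result applied as `rules[!duplicated(pair_key)]`; that part is an assumption and is not run here.

```r
# Toy data frame mirroring the printed rules (values copied from the question).
rules_df <- data.frame(
  lhs        = c("{Mobile=1}", "{Earphone=1}", "{Jeans=1}", "{Shirt=1}"),
  rhs        = c("{Earphone=1}", "{Mobile=1}", "{Shirt=1}", "{Jeans=1}"),
  support    = c(0.025563474, 0.025563474, 0.024030494, 0.024030494),
  confidence = c(0.09925997, 0.09807662, 0.09389671, 0.09308297),
  lift       = c(0.3808200, 0.3808200, 0.3637123, 0.3637123),
  stringsAsFactors = FALSE
)

# Order-independent key: put the two sides in a fixed order before pasting,
# so that A => B and B => A produce the same string.
pair_key <- ifelse(rules_df$lhs < rules_df$rhs,
                   paste(rules_df$lhs, rules_df$rhs),
                   paste(rules_df$rhs, rules_df$lhs))

# Keep only the first rule seen for each unordered pair.
deduped <- rules_df[!duplicated(pair_key), ]
print(deduped)
```

Because the first rule of each pair is the one kept, sorting by confidence beforehand would keep the direction with the higher confidence; support and lift are the same for both directions.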