How do I analyze Market Basket Output?

Question

I have sales data as below:

+------------+------+-------+
| Receipt ID | Item | Value |
+------------+------+-------+
|          1 | a    |     2 |
|          1 | b    |     3 |
|          1 | c    |     2 |
|          1 | k    |     4 |
|          2 | a    |     2 |
|          2 | b    |     5 |
|          2 | d    |     6 |
|          2 | k    |     7 |
|          3 | a    |     8 |
|          3 | k    |     1 |
|          3 | c    |     2 |
|          3 | q    |     3 |
|          4 | k    |     4 |
|          4 | a    |     5 |
|          5 | b    |     6 |
|          5 | a    |     7 |
|          6 | a    |     8 |
|          6 | b    |     3 |
|          6 | c    |     4 |
+------------+------+-------+

Using the Apriori algorithm, I generated rules and split each rule into separate columns.

For example, I got the output below. I dropped the support, confidence, and lift values and kept only the rules, mapped into the columns Target Item, Item1, and Item2 ({Item1,Item2} -> {Target Item}).

Output is as below:

+-------------+-------+-------+
| Target Item | Item1 | Item2 |
+-------------+-------+-------+
| a           | b     |       |
| a           | b     | c     |
| a           | k     |       |
+-------------+-------+-------+
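The mapping from a rule to these columns can be sketched in base R. This is only a minimal sketch, assuming the rules are available as strings of the form `{b,c} => {a}` and that no rule has more than two items on the left-hand side. The `rule_strings` vector and the `split_rule` helper are hypothetical names used for illustration, not part of arules.

```r
# Hypothetical rule strings in the "{lhs} => {rhs}" form (assumed input)
rule_strings <- c("{b} => {a}", "{b,c} => {a}", "{k} => {a}")

# Split one rule string into its target item and up to two antecedent items
split_rule <- function(rule) {
  # Separate the two sides of the rule and drop braces and whitespace
  sides <- trimws(strsplit(rule, "=>", fixed = TRUE)[[1]])
  sides <- gsub("[{}]", "", sides)
  lhs_items <- trimws(strsplit(sides[1], ",", fixed = TRUE)[[1]])
  data.frame(
    target_item = sides[2],
    item1 = lhs_items[1],
    # NA marks a rule with a single antecedent item
    item2 = if (length(lhs_items) > 1) lhs_items[2] else NA_character_,
    stringsAsFactors = FALSE
  )
}

# One row per rule
rules_cols <- do.call(rbind, lapply(rule_strings, split_rule))
print(rules_cols)
```

Rules with longer left-hand sides would need extra columns or a list column instead of the fixed `item1`/`item2` pair.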

I want to find all receipts that contain each rule's item combination, then calculate the sale value of the Target Item in those receipts only, and also the combined sale value of Item1 and Item2 in those same receipts:

The output should look something like the table below (I don't need the Receipt IDs column):

+-------------+-------+-------+--------------+----------------------+------------------------------+
| Target Item | Item1 | Item2 | Receipt ID's | Value of Target Item | Remaining value(Item1+item2) |
+-------------+-------+-------+--------------+----------------------+------------------------------+
| a           | b     |       | 1,2,5,6      | 2+2+7+8              | 3+5+6+3                      |
| a           | b     | c     | 1,6          | 2+8                  | (3+3) + (2+4)                |
| a           | k     |       | 1,2,3,4      | 2+2+8+5              | 4+7+1+4                      |
+-------------+-------+-------+--------------+----------------------+------------------------------+
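One possible way to compute such a table is sketched below in base R. It assumes the rules have already been placed in a data frame with `target_item`, `item1`, and `item2` columns (with `NA` for a missing second item), and that a receipt matches a rule when it contains the target item and every listed item. The names `sales`, `rules`, `baskets`, and `summarise_rule` are illustrative only.

```r
# Sales data from the question
sales <- data.frame(
  Receipt_ID = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,6,6,6),
  item  = c("a","b","c","k","a","b","d","k","a","k","c","q","k","a","b","a","a","b","c"),
  value = c(2,3,2,4,2,5,6,7,8,1,2,3,4,5,6,7,8,3,4),
  stringsAsFactors = FALSE
)

# Rules from the question, one row per rule (NA means no second item)
rules <- data.frame(
  target_item = c("a", "a", "a"),
  item1 = c("b", "b", "k"),
  item2 = c(NA, "c", NA),
  stringsAsFactors = FALSE
)

# Items bought on each receipt, keyed by receipt ID
baskets <- split(sales$item, sales$Receipt_ID)

summarise_rule <- function(target, item1, item2) {
  lhs <- c(item1, item2)
  lhs <- lhs[!is.na(lhs)]
  needed <- c(target, lhs)
  # Receipts that contain every item of the rule
  hit <- names(baskets)[vapply(baskets, function(b) all(needed %in% b), logical(1))]
  in_hit <- sales$Receipt_ID %in% as.numeric(hit)
  data.frame(
    target_item = target, item1 = item1, item2 = item2,
    receipts = paste(hit, collapse = ","),
    # Value of the target item within the matching receipts
    target_value = sum(sales$value[in_hit & sales$item == target]),
    # Combined value of item1 and item2 within the matching receipts
    remaining_value = sum(sales$value[in_hit & sales$item %in% lhs]),
    stringsAsFactors = FALSE
  )
}

result <- do.call(rbind, Map(summarise_rule, rules$target_item, rules$item1, rules$item2))
rownames(result) <- NULL
print(result)
```

With the sample data this gives target and remaining totals of 19 and 17 for {b} -> {a}, 10 and 12 for {b,c} -> {a}, and 17 and 16 for {k} -> {a}. The `receipts` column can be dropped if the IDs are not needed.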

To replicate the Apriori run:

library(arules)

Data <- data.frame(
  Receipt_ID = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,6,6,6),
  item = c('a','b','c','k','a','b','d','k','a','k','c','q','k','a','b','a','a','b','c'),
  value = c(2,3,2,4,2,5,6,7,8,1,2,3,4,5,6,7,8,3,4)
)


write.table(Data,"item.csv",sep=',',row.names = F)

data_frame = read.transactions(
  file = "item.csv",
  format = "single",
  sep = ",",
  cols = c("Receipt_ID","item"),
  rm.duplicates = T
) 

rules_apriori <- apriori(data_frame)


rules_apriori


rules_tab <- as(rules_apriori, "data.frame")


rules_tab

# Split each rule string "{lhs} => {rhs}" into its two sides
out <- strsplit(as.character(rules_tab$rules), '=>')
sides <- do.call(rbind, out)
# Strip braces and surrounding whitespace
rules_tab$lhs <- trimws(gsub("[{}]", "", sides[, 1]))
rules_tab$rhs <- trimws(gsub("[{}]", "", sides[, 2]))

# One row per rule: target item (rhs) and its item combination (lhs)
rules_final <- data.frame(target_item = rules_tab$rhs, item_combination = rules_tab$lhs)

rules_final

Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Don't put "boxes" around input data, that makes it much harder to re-import into R. - MrFlick, Mar 09 '17 at 16:03
