I am working with transaction data for Market Basket Analysis, stored in the following table format:

Id Product    
 1  Prod A    
 1  Prod B    
 1  Prod C    
 1  Prod D   
 2  Prod A    
 2  Prod B

I want to reshape the data so that the apriori algorithm can treat each Id as a single transaction. To do that, I want to convert the data into the following format:

Id Column1 Column2 Column3 Column4
 1  Prod A  Prod B  Prod C  Prod D
 2  Prod A  Prod B

  1. Can anyone help me with a way to convert this data in R or Excel?

  2. Will data in this format work with the apriori algorithm in R?


2 Answers

Use dcast from the reshape2 package in R:

df <- data.frame(Id=c(1,1,1,1,2,2), Product=c("Prod A", "Prod B", "Prod C", "Prod D", "Prod A", "Prod B"))

library(reshape2)
dcast(df, Id~Product, value.var="Product")
#    Id Prod A Prod B Prod C Prod D
#  1  1 Prod A Prod B Prod C Prod D
#  2  2 Prod A Prod B   <NA>   <NA>
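Note that dcast names the columns after the products. If sequentially numbered columns like those in the question are wanted instead, a minimal base R sketch could look like the following. The helper column `Pos` and the result name `wide` are illustrative names, not part of the original answer.

```r
# Same example data as above
df <- data.frame(
  Id = c(1, 1, 1, 1, 2, 2),
  Product = c("Prod A", "Prod B", "Prod C", "Prod D", "Prod A", "Prod B"),
  stringsAsFactors = FALSE
)

# Position of each product within its Id (1, 2, 3, ... restarting per Id)
df$Pos <- ave(df$Id, df$Id, FUN = seq_along)

# One row per Id, one column per position; missing positions become NA
wide <- reshape(df, idvar = "Id", timevar = "Pos", direction = "wide")
print(wide)
```

The resulting columns are named Product.1, Product.2, and so on, and Ids with fewer products are padded with NA.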
ID <- c(1,1,1,1,2,2)
Product <- c("Prod A","Prod B","Prod C","Prod D","Prod A","Prod B")
df <- data.frame (ID, Product)

You can create dummy (0/1) indicator columns for the second step using xtabs:

xtabs(~ ID + Product, df)

 ID Prod A Prod B Prod C Prod D
  1      1      1      1      1
  2      1      1      0      0

In a second step, you can use the arules package to run the apriori algorithm.
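As a rough sketch of that second step, the long Id/Product table can be handed to arules without widening it first: split the products by ID and coerce the resulting list to the transactions class. The support and confidence thresholds below are arbitrary values chosen for this two-transaction toy example, and the names `baskets`, `trans`, and `rules` are illustrative.

```r
library(arules)

df <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2),
  Product = c("Prod A", "Prod B", "Prod C", "Prod D", "Prod A", "Prod B"),
  stringsAsFactors = FALSE
)

# One character vector of items per transaction ID
baskets <- split(df$Product, df$ID)

# Coerce the list of item vectors to arules' "transactions" class
trans <- as(baskets, "transactions")

# Mine association rules; thresholds are assumptions for illustration only
rules <- apriori(trans, parameter = list(support = 0.5, confidence = 0.8))
inspect(rules)
```

With real data, the thresholds would need tuning, since two transactions are far too few to yield meaningful rules.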
