I am working with the R package arules. I have a CSV file with 6 columns named item1, item2, item3, item4, item5 and item6. Each cell holds one item in a basket, and each row is the whole basket for one transaction. The problem is that after reading the CSV file with:

data <- read.csv('file.csv')

and after turning it into transactions:

trans <- as(data, "transactions")

I find that empty cells are treated as items, with names such as 'item3='. Is there a way to specify that empty cells should be ignored, or to remove certain items from an R transactions object?

Blue Moon
  • Could you melt the data before reading it as transactions? This would be an easy way to eliminate missing data. – effel Mar 10 '16 at 17:12
  • I know there is a function read.transactions. However, could you show me how the data needs to be arranged in order to be read correctly? – Blue Moon Mar 10 '16 at 17:14
  • Afraid I don't work with transactions data, but I'd try: library(reshape2); d= melt(data, id.vars = NULL); d= d[!is.na(value), ]; as.transactions(d) – effel Mar 10 '16 at 17:18
  • It would be easier to help you if you provided a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Provide sample input data and the desired output for that data. – MrFlick Mar 10 '16 at 17:23

1 Answer

I don't think the code you used to create trans will work.

You can try this instead. Arrange your data in a two-column format (user, item), like this:

  1. User1: a
  2. User2: b
  3. User1: c
  4. User1: a
  5. User2: d
  6. User2: b

After doing this, remove duplicate rows. In the sample above these are rows 4 and 6, which repeat rows 1 and 2. You can then use the package's coercion to turn the data into transactions:

# Group items by user, then coerce the resulting list to transactions
trans1 <- as(split(mydata$product, mydata$user_id), "transactions")

The result of this step is a transactions object rather than a data frame. You can then go on to run apriori on it.
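For the six-column layout in the question, the whole approach might look roughly like the sketch below. It is only an illustration under some assumptions: the small data frame stands in for read.csv('file.csv'), the row number is used as the basket id, and the names long, user_id and product are made up for the example. Empty cells are dropped before the coercion, so no 'item3=' style items should appear.

```r
library(arules)

# Hypothetical stand-in for read.csv('file.csv'); empty cells are "".
data <- data.frame(
  item1 = c("a", "b", "a"),
  item2 = c("c", "d", ""),
  item3 = c("a", "", ""),
  stringsAsFactors = FALSE
)

# Reshape to two columns: one row per (basket, item) pair.
# unlist() stacks the columns one after another, so the row ids repeat per column.
long <- data.frame(
  user_id = rep(seq_len(nrow(data)), times = ncol(data)),
  product = unlist(data, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop empty or missing cells, then duplicate (basket, item) pairs.
long <- long[!is.na(long$product) & long$product != "", ]
long <- unique(long)

# Group items by basket and coerce the list to transactions.
trans <- as(split(long$product, long$user_id), "transactions")
inspect(trans)
```

If the real file is read with read.csv, adding na.strings = "" would turn empty cells into NA, which the filter above also removes.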

Isha Dua