R removing items from transactions data

Question

I am working with the R package arules. I have a CSV file with 6 columns named item1, item2, item3, item4, item5, and item6. Each cell represents an item in a basket, and each row is the whole basket for one transaction. The problem is that after reading the CSV file with:

data <- read.csv('file.csv')

and after turning it into transactions:

trans <- as(data, "transactions")

I find that empty cells are treated as items, with names such as 'item3='. Is there a way to specify that empty cells should be ignored, or is it possible to remove certain items from an R transactions object?

Could you melt the data before reading it as transactions? This would be an easy way to eliminate missing data. — effel, Mar 10 '16 at 17:12
I know there is a function read.transactions. However, could you show me how the data needs to be arranged in order to be read correctly? — Blue Moon, Mar 10 '16 at 17:14
Afraid I don't work with transactions data, but I'd try: library(reshape2); d <- melt(data, id.vars = NULL); d <- d[!is.na(d$value), ]; as.transactions(d) — effel, Mar 10 '16 at 17:18
It would be easier to help you if you provided a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Provide sample input data and the desired output for that data. — MrFlick, Mar 10 '16 at 17:23

score 0 · Answer 1 · answered Apr 29 '17 at 14:58

I don't think the code you have used for trans will work.

You can try this instead. Arrange your data in a two-column format, like this:

User1: a
User2: b
User1: c
User1: a
User2: d
User2: b
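A minimal sketch of one way to get from the wide CSV layout to this two-column layout, assuming empty cells are read as "" or NA. The sample data frame, the column names `user_id` and `product`, and the use of the row number as the basket id are assumptions for illustration, not part of the original question.

```r
# Hypothetical stand-in for read.csv('file.csv', stringsAsFactors = FALSE)
data <- data.frame(
  item1 = c("a", "b"),
  item2 = c("c", "d"),
  item3 = c("a", ""),      # "" marks an empty cell
  stringsAsFactors = FALSE
)

# unlist() stacks the columns one after another, so the row (basket) ids
# repeat once per column: 1, 2, 1, 2, ...
mydata <- data.frame(
  user_id = rep(seq_len(nrow(data)), times = ncol(data)),
  product = unlist(data, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop empty cells so they never become items
mydata <- mydata[!is.na(mydata$product) & mydata$product != "", ]
```

Filtering at this stage means the empty cells are gone before anything is converted to transactions.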

After doing this, remove duplicate rows. In the sample above, these are rows 4 and 6, which repeat rows 1 and 2. Then you can use the following code to reshape the data into transactions:

# Transposing data to run the algorithm: one vector of products per user,
# then coerce the resulting list to transactions
trans1 <- as(split(mydata$product, mydata$user_id), "transactions")

When you run the code above, the result is a transactions object, not a data frame. You can then go on to run apriori on it.
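Putting the steps together, here is a minimal sketch, assuming the arules package is installed. The names `mydata`, `user_id`, and `product` follow the naming in this answer, the sample values mirror the two-column example, and the support and confidence thresholds are arbitrary. The last lines show one possible way to drop an item by label from an existing transactions object, which may also address the second half of the question.

```r
library(arules)

# Two-column data mirroring the sample above (values are illustrative)
mydata <- data.frame(
  user_id = c("User1", "User2", "User1", "User1", "User2", "User2"),
  product = c("a", "b", "c", "a", "d", "b"),
  stringsAsFactors = FALSE
)

# Remove duplicate (user, product) rows; a transaction cannot list an item twice
mydata <- unique(mydata)

# One character vector of products per user, then coerce to transactions
trans1 <- as(split(mydata$product, mydata$user_id), "transactions")
inspect(trans1)

# Mine rules; the thresholds here are arbitrary
rules <- apriori(trans1, parameter = list(support = 0.5, confidence = 0.5))

# Alternatively, drop unwanted items from an existing transactions object
# by selecting item columns with a logical vector
keep <- !(itemLabels(trans1) %in% c("d"))
trans2 <- trans1[, keep]
```

The same column selection could be applied to a label such as 'item3=' if the transactions were built directly from the data frame.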
