Create a dataframe from a vector of numbers in R

Question

I am trying to mine frequent itemsets and association rules from data stored in a CSV file. I learnt about the arules package in R and decided to use it.

I am having trouble creating a data frame from the CSV.

My CSV file essentially has the data in the following format:

transactionid,items
1,"milk,beer,diapers"
2,"coke,milk,eggs"
3,"diapers,eggs,coke"

Could anyone help me create a data frame that I can pass to the apriori() or eclat() functions of the arules library?

Thanks!

I'm guessing he also wants to split the items. So adapting from [here](http://stackoverflow.com/questions/7069076/split-column-at-delimiter-in-data-frame): `df <- read.csv("test.csv", stringsAsFactors = FALSE)` and then `cbind(df[,1, F], with(df, data.frame(do.call(rbind, strsplit(items, ',', fixed=TRUE)))))` If the number of items isn't constant then it's probably better to use `separate` from `tidyr` or something similar. — Molx, Sep 27 '15 at 01:28

jlhoward · Answer 1 · 2015-09-27T05:51:05.680

It sounds like you want to import data from a csv file into a transactions object.

df <- read.csv(text='transactionid,items
               1,"milk,beer,diapers"
               2,"coke,milk,eggs"
               3,"diapers,eggs,coke"',
               stringsAsFactors=FALSE)

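library(arules)
# Split each comma-separated item string into a character vector, one per transaction
lst        <- strsplit(df$items, split = ",", fixed = TRUE)
# Name the list elements so the IDs carry over as transactionID
names(lst) <- df$transactionid
# Coerce the named list to an arules transactions object
trans      <- as(lst, "transactions")
inspect(trans)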
#   items     transactionID
# 1 {beer,                 
#    diapers,              
#    milk}                1
# 2 {coke,                 
#    eggs,                 
#    milk}                2
# 3 {coke,                 
#    diapers,              
#    eggs}                3
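
Once the data is a transactions object it can be passed straight to the mining functions mentioned in the question. The following is only a sketch: it rebuilds the same three transactions so that it runs on its own, and the support and confidence thresholds are arbitrary values picked for this tiny example, not recommendations.

```r
library(arules)

# Same three transactions as above, rebuilt here so the snippet is self-contained
lst <- list("1" = c("milk", "beer", "diapers"),
            "2" = c("coke", "milk", "eggs"),
            "3" = c("diapers", "eggs", "coke"))
trans <- as(lst, "transactions")

# Frequent itemsets; support = 0.5 is an assumed threshold for illustration
itemsets <- eclat(trans, parameter = list(support = 0.5))

# Association rules; both thresholds are assumed values for illustration
rules <- apriori(trans, parameter = list(support = 0.5, confidence = 0.8))

inspect(itemsets)
inspect(rules)
```

With real data you would normally use much lower support values, since three transactions make every itemset look frequent.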

You should also take a look at the read.transactions(...) function.
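
As a rough sketch of that route: read.transactions() with format = "basket" expects one transaction per line with the items separated by sep, so the quoted items column from the question would be read as a single item. One way around this, assuming the data frame is already loaded, is to write the rows back out as plain comma-separated lines and read those. The temporary file and the object names below are made up for the example.

```r
library(arules)

# Stand-in for the data frame read from the CSV in the question
df <- data.frame(transactionid = 1:3,
                 items = c("milk,beer,diapers", "coke,milk,eggs", "diapers,eggs,coke"),
                 stringsAsFactors = FALSE)

# Write one line per transaction: id first, then the unquoted items
tmp <- tempfile(fileext = ".csv")   # hypothetical scratch file
writeLines(paste(df$transactionid, df$items, sep = ","), tmp)

# cols = 1 marks the first field as the transaction ID rather than an item
trans2 <- read.transactions(tmp, format = "basket", sep = ",", cols = 1)
inspect(trans2)
```

If the CSV can be produced in this unquoted layout in the first place, the intermediate data frame is not needed at all.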
