I have what I thought was a well-prepared dataset. I wanted to use the Apriori Algorithm in R to look for associations and come up with some rules. I have about 16,000 rows (unique customers) and 179 columns that represent various items/categories. The data looks like this:
Cat1 Cat2 Cat3 Cat4 Cat5 ... Cat179
1, 0, 0, 0, 1, ... 0
0, 0, 0, 0, 0, ... 1
0, 1, 1, 0, 0, ... 0
...
I thought having a comma separated file with binary values (1/0) for each customer and category would do the trick, but after I read in the data using:
data5 = read.csv("Z:/CUST_DM/data_test.txt",header = TRUE,sep=",")
and then run this command:
rules = apriori(data5, parameter = list(supp = .001,conf = 0.8))
I get the following error:
Error in asMethod(object):
column(s) 1, 2, 3, ...178 not logical or a factor. Discretize the columns first.
I understand Discretize but not in this context I guess. Everything is a 1 or 0. I've even changed the data from INT to CHAR and received the same error. I also had the customer ID (unique) as column 1 but I understand that isn't necessary when the data is in this form (flat file). I'm sure there is something obvious I'm missing - I'm new to R.
What am I missing? Thanks for your input.