I am using the arules package in R to compute rules for a data frame with 850 columns and 1335 rows. It mostly contains logical, numeric and character data; we have converted the logical and character columns to factors.

When I restrict the lhs to a subset (almost any subset) and generate the rules with the following code, I run into the issue described below. Please note that the whole set has 1,250,388 rules:

apriori(newdata,
        parameter = list(minlen = 1, maxlen = 4, supp = 0.6, conf = 0.6, originalSupport = FALSE, ext = TRUE),
        appearance = list(lhs = lhs1, rhs = rhs1, default = "none"),
        control = list(memopt = TRUE, load = FALSE))

RStudio runs the code the first time, but on subsequent runs it stops responding.

On the second run, after a long wait, the session crashes.

A reproducible example:

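library("arules")
data("Adult")
# Adult already ships as a transactions object, so this coercion is a no-op
ndat <- as(Adult, "transactions")

# List of items
item1 <- itemInfo(ndat)

# Defining a target column: the variable of the last item ("income" in Adult).
# colnames(Adult) returns item labels such as "income=large", not variable names,
# so it cannot be compared against item1$variables.
tar_var <- as.character(item1$variables[nrow(item1)])

# Defining LHS and RHS
rhs1 <- item1$labels[item1$variables == tar_var]
lhs1 <- item1$labels[item1$variables != tar_var]

# Taking a random sample of half of the LHS item labels
lhs2 <- sample(lhs1, round(length(lhs1) / 2, 0))

# Code for generating rules that kills the session
system.time(
  rule_gen <- apriori(ndat,
                      parameter = list(minlen = 1, maxlen = 2 + 1, supp = 0, conf = 0, originalSupport = FALSE, ext = TRUE),
                      appearance = list(lhs = lhs2, default = "rhs"),
                      control = list(memopt = TRUE, load = FALSE))
)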

Please help me out. I am using a Windows PC with 4 GB of RAM. Thanks in advance.

swarup
  • Did you try to use a subset of your data? Is it always reproducible then? Did you try anything else? Please read [(1)](http://stackoverflow.com/help/how-to-ask) how to ask a good question, [(2)](http://stackoverflow.com/help/mcve) how to create an MCVE, as well as [(3)](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610) how to provide a minimal reproducible example in R. – Christoph Jul 25 '16 at 08:42
  • I have used a subset that is half of the data. My data contains German characters such as umlauts; does that make a difference? The initial encoding is not UTF-8; I set the encoding to UTF-8 in read.csv. – swarup Jul 25 '16 at 09:18
  • We need data and code to reproduce the problem (see Christoph's comment). – Michael Hahsler Jul 28 '16 at 16:34
  • Please refer to the reproducible example. – swarup Aug 01 '16 at 07:52
