I'm trying to extract random forest rules as text.

library(inTrees)  # RF2List, extractRules, getRuleMetric
library(stringr)  # str_replace

dat1 <- readRDS("model_2020-03-01_12.rds")
Features <- c(rownames(dat1$importanceSD[,0]))

featMarks <- c()

for(i in 1:length(Features )){
  featMarks[i] <- paste("X[,",i,"]",sep="")
}

treeList <- RF2List(dat1)
ruleExec <- extractRules(treeList,dataSet,digits=4,ntree=1000)
ruleExec <- unique(ruleExec)
ruleMetric <- getRuleMetric(ruleExec, dataSet, dataSet$Class)

rulesText <- c()

for(i in 1:nrow(ruleMetric)){  
  rulesText[i] <- paste("if ", ruleMetric[i,"condition"]," then ",ruleMetric[i,"pred"],sep="")
}

for(k in 1:length(Features)){
  pattern <- featMarks[k]
  replacement <-  Features[k]
  rulesText <- str_replace(rulesText, pattern, replacement)    
}

For some reason, the output I get is identical to the input.

I also ran it line by line, and that didn't work either. `gsub(pattern, replacement, rulesText)` returned the same result. To clarify the line-by-line part:

ruleLine <- paste("if ", ruleMetric[1,"condition"]," then ",ruleMetric[1,"pred"],sep="")

# ruleLine is "if X[,4]<=1.5 & X[,7]<=4.5 & X[,10]<=18 then g"

for(k in 1:length(Features)){
  pattern <- featMarks[k]
  replacement <- Features[k]
  ruleLine <- str_replace(ruleLine, pattern, replacement)
}
# ruleLine is still "if X[,4]<=1.5 & X[,7]<=4.5 & X[,10]<=18 then g"

What am I missing?

Or Amit
  • I see a bunch of `for`-loops and I think to myself: "I guess there are better ways." Please show an example of your input data and your desired output based on that data. – Martin Gal Jun 22 '20 at 14:11
  • You wrote: "ruleLine is still..." but what did you expect? – Martin Gal Jun 22 '20 at 14:12
  • @MartinGal I use multiple `for` loops to try and identify the problem. Also, this script is only used to extract the randomForest model to text, so it doesn't need to be super-efficient. My input data is standard RF input data using multiple indicators (feat1, feat2, ... feat20) and a Class column ("g" or "b"). My desired output is a list of conditions: "if feat4<=1.5 & feat7<=4.5 & feat10<=18 then g" – Or Amit Jun 22 '20 at 14:17
  • @MartinGal I expected `str_replace` to replace `X[,4]` with `feat4` – Or Amit Jun 22 '20 at 14:18
  • Ah, okay. Try: `paste("X\\[,",i,"\\]",sep="")` since `str_replace` uses regular expressions and `[` is a special character. – Martin Gal Jun 22 '20 at 14:24
  • Thanks! It worked. I wouldn't have thought of that – Or Amit Jun 22 '20 at 14:27
  • Btw: Instead of looping you could use: `featMarks <- paste("X\\[,",1:length(Features),"\\]",sep="")` or `paste0("X\\[,",1:length(Features),"\\]")`. – Martin Gal Jun 22 '20 at 14:29
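
To spell out why the original pattern never matched: read as a regular expression, `X[,4]` means "an `X` followed by one character that is either `,` or `4`", and the rule strings contain neither `X,` nor `X4`. The sketch below reproduces this with made-up data and shows an alternative to escaping the brackets, namely base R's `gsub(..., fixed = TRUE)`, which treats the pattern as a literal string. The feature names `feat1` to `feat20` are an assumption based on the asker's comment, and the rule string is copied from the question; none of this touches the actual model.

```r
# Assumed feature names (feat1 ... feat20), per the asker's description
Features <- paste0("feat", 1:20)

# Rule string copied from the question
ruleLine <- "if X[,4]<=1.5 & X[,7]<=4.5 & X[,10]<=18 then g"

# As a regex, "X[,4]" is X plus a character class, so it finds nothing here
unescaped_hit <- grepl("X[,4]", ruleLine)
# Escaping the brackets makes them literal, as suggested in the comments
escaped_hit <- grepl("X\\[,4\\]", ruleLine)

# Literal markers, built without a loop; no escaping because fixed = TRUE
featMarks <- paste0("X[,", seq_along(Features), "]")

for (k in seq_along(Features)) {
  # gsub replaces every occurrence; fixed = TRUE disables regex interpretation
  ruleLine <- gsub(featMarks[k], Features[k], ruleLine, fixed = TRUE)
}

print(ruleLine)
```

Two things to keep in mind: `str_replace` and `sub` only replace the first match in each string, so a rule that mentions the same feature twice needs `str_replace_all` or `gsub`. Because each marker includes the closing bracket, `X[,1]` cannot accidentally match inside `X[,10]`.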
