I am new to R and machine learning. I tried to build a random forest classification model that predicts the priority of an incident ticket from its description. These are the steps I followed.

1) Read the incident descriptions from a CSV file.

library(tm)
library(SnowballC)
library(caTools)
library(randomForest)
incidents = read.csv("incident.csv", stringsAsFactors = FALSE)

> str(incidents)
'data.frame':  4265 obs. of  7 variables:
 $ number                : chr  "INC0031193" "INC0037867" "INC0159979" "INC0031446" ...
 $ u_detailed_description: chr  "Close & Ignore new Ticket New-Production SNOW Auto Routing test for XYZ SNOW ticketing in uat" "" "" "" ...
 $ priority              : chr  "3 - Moderate" "2 - High" "4 - Low" "3 - Moderate" ...
 $ state                 : chr  "Canceled" "Canceled" "Canceled" "Canceled" ...
 $ category              : chr  "Server" "Tools" "Server" "Server" ...
 $ assignment_group      : chr  "Windows" "Tools" "SNOC Support" "Windows" ...

2) Clean the data, create a DocumentTermMatrix, and convert it to a data frame.

incidentCorpus <- Corpus(VectorSource(incidents$u_detailed_description))
incidentCorpus <- tm_map(incidentCorpus, tolower)
incidentCorpus <- tm_map(incidentCorpus, removePunctuation)
incidentCorpus <- tm_map(incidentCorpus, removeWords, stopwords("english"))
incidentCorpus <- tm_map(incidentCorpus, stemDocument)
incidentDTM <- DocumentTermMatrix(incidentCorpus)
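The conversion to a data frame (the `incidentSparse` object used in the next step) is not shown above. A minimal sketch of what that step could look like, run on a toy data set so that it is self-contained; the toy values, the `toy*` names, and the commented-out sparsity threshold are assumptions, not the original code:

```r
library(tm)

# Toy stand-in for the `incidents` data frame (hypothetical values)
toyIncidents <- data.frame(
  u_detailed_description = c("server down in production",
                             "password reset request",
                             "server disk full"),
  priority = c("2 - High", "4 - Low", "3 - Moderate"),
  stringsAsFactors = FALSE
)

toyCorpus <- VCorpus(VectorSource(toyIncidents$u_detailed_description))
toyDTM <- DocumentTermMatrix(toyCorpus)

# Optional: drop very rare terms; 0.995 is an assumed threshold
# toyDTM <- removeSparseTerms(toyDTM, 0.995)

# Convert the document-term matrix to a data frame, one column per term
toySparse <- as.data.frame(as.matrix(toyDTM))

# Make term names syntactically valid so they can be used in a model formula
colnames(toySparse) <- make.names(colnames(toySparse), unique = TRUE)

# Attach the outcome column from the original data
toySparse$priority <- toyIncidents$priority
```

With the real data, the same three final steps would be applied to `incidentDTM` and `incidents$priority`.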

3) Split the data into training and test sets using caTools.

set.seed(123)
split <- sample.split(incidentSparse$priority,SplitRatio = 0.7)
train <- subset(incidentSparse, split == TRUE)
test  <- subset(incidentSparse, split == FALSE)
# randomForest needs a factor outcome to fit a classification model
train$priority <- as.character(train$priority)
train$priority <- as.factor(train$priority)
test$priority  <- as.character(test$priority)
test$priority  <- as.factor(test$priority)

4) Fit the model with randomForest() and use predict() to classify the test set.

incidentRandomF <- randomForest(priority ~ ., data = train, ntree = 200, mtry = 50, importance = TRUE, proximity = TRUE)

5) The accuracy of the model is about 84% on the training set (out-of-bag predictions) and about 88% on the test set.

baselineAccuracy <- sum(diag(table(predict(incidentRandomF, type="class"), train$priority)))/nrow(train)

> baselineAccuracy
[1] 0.8392498

predFinalTestSet_RF <- predict(incidentRandomF, newdata = test,  type="class")
FinalTestSetAccuracy <- sum(diag(table(test$priority,predFinalTestSet_RF)))/nrow(test)

> FinalTestSetAccuracy
[1] 0.8828125
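Overall accuracy can hide weak performance on the less common priority levels. A small base-R sketch, on made-up labels, of how per-class recall can be read from the same kind of confusion table; the `actual` and `predicted` vectors are hypothetical, and both factors are given identical levels so that the diagonal of the table lines up:

```r
# Hypothetical actual and predicted labels sharing the same factor levels
lv <- c("2 - High", "3 - Moderate", "4 - Low")
actual <- factor(c("2 - High", "3 - Moderate", "3 - Moderate",
                   "4 - Low", "4 - Low", "4 - Low"), levels = lv)
predicted <- factor(c("2 - High", "3 - Moderate", "4 - Low",
                      "4 - Low", "4 - Low", "3 - Moderate"), levels = lv)

# Rows are actual classes, columns are predicted classes
confusion <- table(actual, predicted)

# Share of all tickets classified correctly
accuracy <- sum(diag(confusion)) / sum(confusion)

# Share of each actual class that was classified correctly
perClassRecall <- diag(confusion) / rowSums(confusion)
```

With the real model, `actual` would be `test$priority` and `predicted` would be `predFinalTestSet_RF`.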

My classification model is now ready, and I need to use it to predict the priority from a description provided by the user.

How can I pass user input to the R script so that the model can make a prediction from it?

Your help would be highly appreciated. Thanks in advance.

Karolis Koncevičius
Sourav94
  • Please, provide a reproducible example: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Terru_theTerror Mar 21 '18 at 10:34
  • Thanks for your reply. I have updated the question; please check. – Sourav94 Mar 21 '18 at 11:22
  • How do the people who want predictions want to provide input? You could create a web form with Shiny, they could send you a CSV that looks like your test set, or you could distribute your trained model and let them enter values through R. There are a lot of options here. – Josh Rumbut Mar 21 '18 at 11:50
  • @Josh Thanks for your reply. I would like to create a web form with Shiny. It would be helpful if you could give me a high-level idea or a link to a previously solved example. – Sourav94 Mar 21 '18 at 12:27

1 Answer

I haven't written the entire page or tested the code here, but hopefully this is enough to show how to get started (let me know if you have any more questions). Here's roughly how the Shiny app will look:

In a file called ui.R:

fluidPage(

  # Copy the line below to make a text input box
  textInput("u_detailed_description", label = h3("Text input"), value = "Enter text..."),
  #Additional inputs for other fields here

  hr(),
  fluidRow(column(3, verbatimTextOutput("prediction")))

)

Then in server.R:

function(input, output) {

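  # You can access the value of the widget with input$u_detailed_description.
  # The output name must match verbatimTextOutput("prediction") in ui.R.
  # Note: newdata has to be a data frame with the same columns as the training set,
  # so the raw input object still needs to be converted before this will work.
  output$prediction <- renderPrint({ predict(incidentRandomF, newdata = input, type = "class") })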

}
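For `predict` to work, `newdata` generally has to be a data frame with one column per term the model was trained on. Here is a minimal base-R sketch of that alignment step. It assumes the user's text has already been run through the same cleaning steps and turned into a one-row data frame of term counts. `alignToTraining`, `trainTerms`, and the toy terms are hypothetical names for illustration, not part of Shiny or randomForest:

```r
# Align a data frame of term counts to the columns the model was trained on
alignToTraining <- function(newCounts, trainTerms) {
  missingTerms <- setdiff(trainTerms, names(newCounts))
  # Terms the model knows but the new text lacks get a count of zero
  for (term in missingTerms) {
    newCounts[[term]] <- 0
  }
  # Drop terms the model never saw and match the training column order
  newCounts[, trainTerms, drop = FALSE]
}

# Hypothetical training vocabulary and term counts for one new description
trainTerms <- c("auto", "server", "down", "password")
newCounts <- data.frame(server = 1, down = 2, printer = 1)

aligned <- alignToTraining(newCounts, trainTerms)
```

In the real app, `trainTerms` could be something like `setdiff(names(train), "priority")`, and `aligned` would be what gets passed as `newdata`. tm's `DocumentTermMatrix` also accepts a `dictionary` entry in its `control` list, which may be another way to restrict a new matrix to known terms.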

There is a lot of great information and documentation, including many examples, over at the Shiny site.

While looking something else up, I stumbled on this repo, in which someone displays predictions from a model in a Shiny app. It might help clarify how to do things like saving your model and reloading it.
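As a rough illustration of the save-and-reload part, base R's `saveRDS` and `readRDS` can write a fitted model to disk and read it back. This sketch uses a small `lm` fit as a stand-in for the random forest and a temporary file as the location; both are assumptions made only to keep the example self-contained:

```r
# Stand-in model; in the app this would be the trained incidentRandomF object
toyModel <- lm(mpg ~ wt, data = mtcars)

# Hypothetical file location; a real app would use a path inside the app folder
modelPath <- file.path(tempdir(), "toy_model.rds")
saveRDS(toyModel, modelPath)

# Later, for example at the top of server.R, outside the server function
reloaded <- readRDS(modelPath)

# The reloaded object can be used for prediction like the original
toyPrediction <- predict(reloaded, newdata = data.frame(wt = 3))
```

Loading the model once outside the server function means it is not re-read for every user session.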

Josh Rumbut
  • Many thanks for the snippets and the information. I shall check and let you know the outcome. – Sourav94 Mar 21 '18 at 14:50
  • I have modified the script, but I am getting an error in the renderPrint function. Warning: Error in eval: object 'auto' not found Stack trace (innermost first): 91: eval 90: eval 89: model.frame.default 88: model.frame 87: predict.randomForest 86: predict 85: renderPrint [D:\Sourav\Analytics\Service Now/server.R#50] 84: func 83: eval 82: eval 81: withVisible 80: evalVis 79: utils::capture.output 78: paste 77: origRenderFunc 76: output$prediction 1: runApp – Sourav94 Mar 21 '18 at 18:49
  • I believe the columns in the training set don't match the columns built from the input text. Maybe I need to intersect the columns. Please suggest. – Sourav94 Mar 22 '18 at 06:33