
I have R code that imports text data, removes stop words, stems the words and then creates a matrix. The code below does the following:

  1. Uses the create_container function to split the matrix into training and test data sets.
  2. Uses the train_models function to create a model based on SVM.
  3. Runs the model on the test set.
  4. Saves the model.

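    library("RTextTools")
    # Rows 1-2800 are used for training, rows 2801-3162 for testing
    container <- create_container(matrix, as.numeric(as.factor(data[, 2])),
                                  trainSize = 1:2800, testSize = 2801:3162,
                                  virgin = FALSE)
    # Train a linear-kernel SVM and classify the test rows
    models <- train_models(container, "SVM", kernel = "linear", cost = 1)
    results <- classify_models(container, models)
    # Save the trained model to disk
    save(models, file = "my_model1.rda")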
    

I am not able to use the saved model for prediction on new data (matrix_new) with the predict function:

    p <- predict(models,matrix_new)
    #Error in predict.svm(X[[1L]], ...) : test data does not match model !
    

My question is: is it feasible to use a saved model to predict sentiment on new data? From the error, it looks like there is a mismatch between the words that were used when creating the model and the words in the new data. Please clarify whether my understanding is correct.
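To make the suspected mismatch concrete, here is a minimal base-R sketch of aligning new documents to the training vocabulary, so that both matrices have the same columns in the same order. The helpers `tokenize` and `count_terms` and the example texts are hypothetical stand-ins, not RTextTools functions, and the sketch skips stop-word removal and stemming. RTextTools' `create_matrix()` also has an `originalMatrix` argument that appears to be intended for this purpose, but it is not exercised here.

```r
# Hypothetical helper: lower-case a text and split it on non-letters.
tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  tokens[nzchar(tokens)]
}

# Hypothetical helper: count terms per document over a FIXED vocabulary.
# Words outside the vocabulary are dropped; vocabulary words that do not
# occur in a document get a count of 0.
count_terms <- function(docs, vocabulary) {
  rows <- lapply(docs, function(doc) {
    as.integer(table(factor(tokenize(doc), levels = vocabulary)))
  })
  matrix(unlist(rows), nrow = length(docs), byrow = TRUE,
         dimnames = list(NULL, vocabulary))
}

# Made-up training texts; the vocabulary is fixed at training time.
train_docs <- c("great product, works great", "terrible service", "great service")
vocabulary <- sort(unique(unlist(lapply(train_docs, tokenize))))
train_matrix <- count_terms(train_docs, vocabulary)

# New texts contain unseen words ("awful", "support") and miss others.
new_docs <- c("awful product", "great great support")
new_matrix <- count_terms(new_docs, vocabulary)

# Both matrices now share the same columns, which is what a model
# trained on train_matrix expects from new data.
stopifnot(identical(colnames(new_matrix), colnames(train_matrix)))
```

If the new matrix is instead built from scratch on the new texts alone, its columns reflect only the words in those texts, which would be consistent with the "test data does not match model" error.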

  • You'll get better answers faster if you provide a minimal working example with data. The code you have provided does not run (e.g. the `data` object is missing), so it is not easy to help you. – Anders Ellern Bilgrau Jul 11 '17 at 07:41
  • @AEBilgrau - I am not sure where to paste the data, so here is a brief description. **data** is a data frame created from a CSV with two columns, "Comments" and "Sentiment". Below is the code for creating data and matrix: `data <- read.csv(file="D:/Sentiment_Analysis/R_Codes/Comments.csv", header=TRUE, sep=",")` and then, to build the DTM, `matrix = create_matrix(data[,1], language="english", removeStopwords=TRUE, removeNumbers=TRUE, stemWords=TRUE)` – Sri Jul 11 '17 at 09:26
  • Thanks, but that also does not help much, as we obviously do not have access to your local files. Please read this [question and answers](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Functions like `dput` will help you. – Anders Ellern Bilgrau Jul 11 '17 at 09:58
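
As a small illustration of the `dput` suggestion, the sketch below prints a data frame as R code that others can paste to recreate it. The data frame `example_data` and its two rows are made up for illustration; only the column names come from the description above.

```r
# Hypothetical stand-in for the data frame read from the CSV file.
example_data <- data.frame(
  Comments  = c("great product", "terrible service"),
  Sentiment = c("positive", "negative"),
  stringsAsFactors = FALSE
)

# Print a text representation that can be pasted into a question.
dput(example_data)

# For a large data frame, a few rows are usually enough,
# e.g. dput(head(example_data, 20)).
```

Pasting the printed `structure(...)` expression into an R session rebuilds the data frame without access to the original file.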

0 Answers