
I am trying to train an SVM model on the Forest Fires data, which I have split into training and test sets. I am not sure of the correct syntax for handling the variables "day" and "month", which are categorical. Am I going about it the right way? I considered using as.numeric to turn the categories into numbers, but I think that would hurt my SVM. The data comes from https://archive.ics.uci.edu/ml/datasets/Forest+Fires and my code is posted below. On another note, my response variable is transformed with ln(x+1) to account for skew in the data.

    forestfires = read.csv("forestfires.csv")  # read csv file
    head(forestfires)
    summary(forestfires)

    # Build training/test samples (75% train, 25% test)
    set.seed(0508)
    train_idx <- sample(1:nrow(forestfires), 0.75 * nrow(forestfires))
    trainfire <- forestfires[train_idx, ]   # the sampled 75% of rows
    testfire <- forestfires[-train_idx, ]   # the remaining 25%

    #Build SVM model
    library(kernlab)

    vmod<-ksvm(log(area+1)~X+Y+as.factor(month)+as.factor(day)+
    FFMC+DMC+DC+ISI+temp+RH+wind+rain, data=trainfire, type="nu-svr")
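
For comparison, here is a minimal sketch of an alternative to calling as.factor inside the formula: convert "month" and "day" to factors once, before splitting, so the training and test sets share the same levels. It uses only base R and a small made-up data frame (`fires`, with invented values) standing in for forestfires.csv, so the numbers mean nothing. `model.matrix` is used just to show the indicator columns a factor expands to; it is an illustration under these assumptions, not a claim about what ksvm does internally.

```r
# Hypothetical stand-in for forestfires.csv: same column names, made-up values.
set.seed(508)
n <- 40
months <- c("mar", "aug", "sep", "oct")
days <- c("mon", "tue", "wed", "thu", "fri", "sat", "sun")
fires <- data.frame(
  month = sample(months, n, replace = TRUE),
  day   = sample(days, n, replace = TRUE),
  temp  = runif(n, 5, 30),
  area  = rexp(n, rate = 0.1),
  stringsAsFactors = FALSE
)

# Convert to factors once, on the full data, so both subsets keep the same levels.
fires$month <- factor(fires$month, levels = months)
fires$day   <- factor(fires$day, levels = days)

# 75% of the rows for training, the rest for testing.
train_idx <- sample(seq_len(nrow(fires)), floor(0.75 * nrow(fires)))
train <- fires[train_idx, ]
test  <- fires[-train_idx, ]

# Dummy (treatment-contrast) encoding: a factor with k levels becomes
# k - 1 indicator columns, with no ordering implied between the levels.
x_train <- model.matrix(~ month + day + temp, data = train)
x_test  <- model.matrix(~ month + day + temp, data = test)

# Both matrices have the same columns even if a level is absent from one subset.
print(colnames(x_train))
```

By contrast, as.numeric on a factor returns the integer level codes, which would impose an arbitrary ordering and spacing on the months and days.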
Vindication09
