Could you comment on how to handle the following non-linear data (SVM regression)?
tt <- c(1.38, 1.41, 1.38, 1.57, 1.65, 1.45, 1.38, 1.38, 1.38, 1.69, 2.18, 1.89, 0.00, 0.00, 1.20, 0.00, 1.23, 1.40, 1.38, 1.38, 1.38, 1.08, 1.40, 1.88, 1.76, 1.70, 1.87, 0.00, 1.90, 1.40, 0.00, 1.46, 1.51, 0.01, 1.90, 1.63, 0.00, 0.00, 0.01, 2.00, 1.40, 0.00, 1.69, 1.68, 1.70, 1.40, 1.40, 1.64, 1.98, 2.00, 1.40, 2.00, 2.00, 1.78, 1.56, 1.46, 1.69, 1.40, 1.87, 1.38, 0.00, 1.40, 1.43, 1.40, 1.69, 1.69, 1.88, 0.94, 1.69, 1.71, 1.57, 1.38, 1.10, 1.70, 2.00, 1.70, 1.08, 1.70, 0.00, 1.70, 1.80, 0.00, 1.58, 1.80, 1.69, 1.77, 0.00, 0.00, 1.38, 0.00, 0.00, 1.38, 0.00, 0.00)
pp <- c(4, 6, 6, 5, 6, 5, 4, 4, 4, 5, 7, 5, 6, 6, 4, 4, 5, 4, 5, 5, 5, 6, 5, 5, 6, 7, 5, 6, 4, 4, 6, 6, 6, 8, 5, 6, 6, 5, 8, 7, 6, 6, 5, 5, 6, 6, 6, 5, 5, 5, 5, 6, 7, 6, 4, 6, 5, 6, 6, 6, 8, 6, 4, 4, 5, 5, 6, 6, 7, 4, 6, 4, 4, 5, 5, 4, 4, 6, 10, 7, 6, 10, 5, 7, 5, 4, 8, 7, 4, 6, 4, 4, 4, 6)
qq <- c(2, 2, 2, 3, 1, 3, 3, 3, 3, 1, 0, 2, 0, 2, 3, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 3, 0, 1, 3, 1, 1, 1, 0, 1, 1, 2, 2, 2, 1, 1, 1, 2, 3, 0, 1, 3, 0, 1, 0, 2, 3, 3, 1, 1, 1, 0, 0, 2, 3, 3, 2, 1, 3, 0, 3, 3, 2, 1, 1, 2, 2, 0, 3, 2, 1, 0, 3, 4, 2, 3, 3, 1)
For example, I tried the following:
library(kernlab)
huh <- data.frame(tt, pp, qq)
index <- 1:nrow(huh)
testindex <- sample(index, trunc(length(index) / 3))
testset <- huh[testindex, ]
trainset <- huh[-testindex, ]
mod <- ksvm(tt ~ pp + qq, data = trainset, type = "eps-svr", kernel = "rbfdot", kpar = "automatic", C = 10, prob.model = TRUE)
The result looks like this:
Support Vector Machine object of class "ksvm"
SV type: eps-svr (regression)
parameter : epsilon = 0.1 cost C = 10
Gaussian Radial Basis kernel function.
Hyperparameter : sigma = 0.637663227203429
Number of Support Vectors : 55
Objective Function Value : -224.1407
Training error : 0.581297
Laplace distr. width : 1.320399
I can extract the coefficients and the bias (w and b), but I can't find the slack variables (soft margin) that define the loss function. Can you suggest another way to fit this type of data?
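As far as I can tell, the fitted ksvm object has no slot holding the slack variables, but for eps-SVR they follow directly from the epsilon-insensitive loss: for a residual r_i = y_i - f(x_i), xi_i = max(0, r_i - epsilon) and xi_i* = max(0, -r_i - epsilon). Below is a minimal sketch of that calculation. The helper name eps_slack and the toy numbers are made up for illustration and are not part of kernlab. Note also that ksvm scales the variables by default (scaled = TRUE), so the commented usage at the end assumes the model was refitted with scaled = FALSE, so that epsilon and the residuals are in the same units.

```r
# Hypothetical helper (not part of kernlab): recompute the eps-SVR slack
# variables from observed values y and fitted values f.
eps_slack <- function(y, f, eps = 0.1) {
  r <- y - f                          # residuals
  list(
    xi      = pmax(0, r - eps),       # slack for points above the epsilon tube
    xi_star = pmax(0, -r - eps),      # slack for points below the epsilon tube
    loss    = pmax(0, abs(r) - eps)   # epsilon-insensitive loss per point
  )
}

# Toy values, invented purely to show the calculation
y <- c(1.0, 2.0, 3.0)
f <- c(1.05, 1.5, 3.4)
s <- eps_slack(y, f, eps = 0.1)
print(s$xi)
print(s$xi_star)

# Possible use with the model above (assumption: ksvm fitted with scaled = FALSE):
# f_train <- as.vector(predict(mod, trainset))
# s_train <- eps_slack(trainset$tt, f_train, eps = 0.1)
```

Points with both slacks equal to zero lie inside the epsilon tube and contribute nothing to the loss. For every point, the sum xi + xi_star equals the per-point loss.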