I have the R iris dataset which I am using for a PNN. The 3 species have been recoded from level 0 to 3 as follows: 0 is setosa, 1 is versicolor, 2 is virginica. Training set is 75%
Q1. I don't understand the function pred_pnn
, if anyone is good in R perhaps you can explain how it works
Q2. The output of the test set or prediction is shown below, I don't understand the output because it is supposed to be something close to either 0,1,2
data = read.csc("c:/iris-recoded.csv" , header = T)
size = nrow(data)
length = ncol(data)
index <- 1:size
positions <- sample(index, trunc(size * 0.75))
training <- data[positions,]
testing <- data[-positions,1:length-1]
result = data[-positions,]
result$actual = result[,length]
result$predict = -1
nn1 <- smooth(learn(training), sigma = 0.9)
pred_pnn <- function(x, nn){
xlst <- split(x, 1:nrow(x))
pred <- foreach(i = xlst, .combine = rbind) %dopar% {
data.frame(prob = guess(nn, as.matrix(i))$probabilities[1], row.names =NULL)
}
}
print(pred_pnn(testing, nn1))
prob
1 1.850818e-03
2 9.820653e-03
3 6.798603e-04
4 7.421435e-03
5 2.168817e-03
6 3.277354e-03
7 6.541173e-03
8 1.725332e-04
9 2.081845e-03
10 2.491388e-02
11 7.679823e-03
12 1.291811e-03
13 2.197234e-06
14 1.316366e-03
15 1.421219e-05
16 4.639239e-05
17 3.671907e-04
18 1.460001e-04
19 4.382849e-05
20 2.387543e-05
21 1.011196e-05
22 2.719982e-04
23 4.445472e-04
24 1.281762e-04
25 5.931106e-09
26 9.741870e-08
27 9.236434e-09
28 8.384690e-08
29 3.311667e-07
30 6.045306e-11
31 2.949265e-08
32 2.070014e-10
33 8.043735e-06
34 2.136666e-08
35 5.604398e-08
36 2.455841e-07
37 3.445977e-07
38 7.314647e-07