
I have the R iris dataset, which I am using for a PNN (probabilistic neural network). The 3 species have been recoded as levels 0 to 2: 0 is setosa, 1 is versicolor, 2 is virginica. The training set is 75% of the data.

Q1. I don't understand the function pred_pnn. If anyone knows R well, could you explain how it works?

Q2. The output for the test set (the prediction) is shown below. I don't understand it, because I expected each value to be close to 0, 1, or 2.

library(pnn)      # learn(), smooth(), guess()
library(foreach)  # foreach() and %dopar%
data = read.csv("c:/iris-recoded.csv" , header = T)
size = nrow(data)
length = ncol(data)
index <- 1:size
positions <- sample(index, trunc(size * 0.75))

training <- data[positions,]
testing <- data[-positions, 1:(length - 1)]
result = data[-positions,]
result$actual = result[,length]
result$predict = -1
nn1 <- smooth(learn(training), sigma = 0.9)

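# Score each test row with guess() and stack the results into one data frame.
# Note: $probabilities[1] keeps only the first element of the probability vector.
pred_pnn <- function(x, nn){
  xlst <- split(x, 1:nrow(x))
  pred <- foreach(i = xlst, .combine = rbind) %dopar% {
    data.frame(prob = guess(nn, as.matrix(i))$probabilities[1], row.names = NULL)
  }
  pred
}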

print(pred_pnn(testing, nn1))
             prob
1  1.850818e-03
2  9.820653e-03
3  6.798603e-04
4  7.421435e-03
5  2.168817e-03
6  3.277354e-03
7  6.541173e-03
8  1.725332e-04
9  2.081845e-03
10 2.491388e-02
11 7.679823e-03
12 1.291811e-03
13 2.197234e-06
14 1.316366e-03
15 1.421219e-05
16 4.639239e-05
17 3.671907e-04
18 1.460001e-04
19 4.382849e-05
20 2.387543e-05
21 1.011196e-05
22 2.719982e-04
23 4.445472e-04
24 1.281762e-04
25 5.931106e-09
26 9.741870e-08
27 9.236434e-09
28 8.384690e-08
29 3.311667e-07
30 6.045306e-11
31 2.949265e-08
32 2.070014e-10
33 8.043735e-06
34 2.136666e-08
35 5.604398e-08
36 2.455841e-07
37 3.445977e-07
38 7.314647e-07
user4745212

1 Answer


I'm assuming you're using the pnn package. The documentation for ?guess suggests that it does much the same thing as predict does for other models: it predicts which class an observation belongs to. Everything else in pred_pnn is bookkeeping. Why do you get only probabilities? Because whoever wrote the function made it that way: it extracts guess(nn, ...)$probabilities[1], which is the first element of the probability vector and not a class label, and returns only that. If you look at the raw output of guess, you will also find the predicted class tucked away in the $category list element.
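As a rough sketch of how both pieces could be kept, the code below rebuilds a recoded data set from R's built-in iris and returns the predicted class next to its probability. The name pred_pnn_full and the seed are made up for illustration. It also assumes that learn() takes a category.column argument that defaults to the first column; if that is right, training without it would treat the first measurement as the class, which would be consistent with guess returning categories such as "6" or "6.5". I have not checked this against your CSV.

```r
library(pnn)

# Rebuild a data set shaped like the question's iris-recoded.csv:
# four measurements followed by a numeric class
# (0 = setosa, 1 = versicolor, 2 = virginica).
data <- iris[, 1:4]
data$species <- as.integer(iris$Species) - 1L

set.seed(1)  # arbitrary seed, only so the split is repeatable
positions <- sample(nrow(data), trunc(nrow(data) * 0.75))
training <- data[positions, ]
testing  <- data[-positions, 1:4]

# Assumption: learn() uses column 1 as the class unless category.column
# is given, so the label column (the last one here) is named explicitly.
nn1 <- smooth(learn(training, category.column = ncol(training)), sigma = 0.9)

# Hypothetical variant of pred_pnn that keeps the predicted class as well.
pred_pnn_full <- function(x, nn) {
  rows <- lapply(seq_len(nrow(x)), function(i) {
    g <- guess(nn, as.matrix(x[i, , drop = FALSE]))
    if (!is.list(g)) {
      # Guard for the case where guess() does not return a list
      return(data.frame(category = NA_character_, prob = NA_real_))
    }
    data.frame(category = as.character(g$category),
               prob = max(unlist(g$probabilities)))
  })
  do.call(rbind, rows)
}

pred <- pred_pnn_full(testing, nn1)
head(pred)

# Cross-tabulate predicted against actual classes for the held-out rows
table(predicted = pred$category, actual = data$species[-positions])
```

A plain lapply loop stands in for foreach/%dopar% here so that the sketch runs without a registered parallel backend; the per-row logic is otherwise the same as in pred_pnn.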

Roman Luštrik
  • Thank you. But I am not really sure how to look at the raw output. I tried this, but it does not look right: `guess(nn1,c(1,5))$category` gives "6", `guess(nn1,c(2,5))$category` gives "6.5", and `guess(nn1,c(3,5))$category` gives "6.2". – user4745212 Apr 14 '15 at 07:30
  • @user4745212 can you make your question reproducible? I have little experience with this package and it would be easier to work with concrete data. – Roman Luštrik Apr 14 '15 at 08:13
  • I am not sure what you mean by reproducible. I can email you the dataset. I am also new to pnn so not much help to you. – user4745212 Apr 14 '15 at 08:37
  • Simulate some data (or load the existing iris data set and make appropriate modifications to it) and use that in the code you provided. Reproducible means that I just copy/paste everything into R and everything works fine. I make some adjustments if needed and paste back the working result. More importantly, it gives you time to reflect on your data, workflow and the desired result. – Roman Luštrik Apr 14 '15 at 08:59
  • @user4745212 You can read more about reproducibility [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Roman Luštrik Apr 14 '15 at 09:00
  • An example in my [blog](http://www.parallelr.com/r-deep-neural-network-from-scratch/) – Patric Feb 19 '16 at 05:33