Right now I have a training loop to test different model parameters for the kknn
package, and it looks like this:

    # generate validation results
    kernel <- c('gaussian', 'optimal', 'rectangular', 'biweight', 'cos', 'inv', 'triangular', 'epanechnikov')
    # empty array to hold the results
    results <- array(dim = c(length(kernel)*50, 4), dimnames = list(NULL, c('K', 'MSE', 'MAE', 'KERNEL')))
    start = 1
    stop = 50
    # run the loop
    for (i in kernel) {
      model <- train.kknn(R1~., data, kmax = 50, kernel = i)
      results[start:stop, 1] = 1:50
      results[start:stop, 2] = model$MEAN.SQU
      results[start:stop, 3] = model$MEAN.ABS
      results[start:stop, 4] = i
      start = start + 50
      stop = stop + 50
    }

This works well enough. However, I eventually want to use the `summarize` function in dplyr to look at my model results, and the main problem I'm running into is that the values in `results` all seem to be strings.

If I call `typeof` on each column in `results`, it returns `character`, but I would expect it to return `double` instead.
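
One likely explanation, sketched here in base R rather than with the real model output: an R array or matrix can hold only one atomic type, so writing the kernel name into the fourth column would coerce every value already stored to `character`. The dimensions and the MSE numbers below are made-up placeholders, not kknn results.

```r
# Small stand-in for the `results` array from the loop (placeholder values).
results <- array(dim = c(4, 3), dimnames = list(NULL, c("K", "MSE", "KERNEL")))

results[1:4, 1] <- 1:4
results[1:4, 2] <- c(0.9, 0.7, 0.6, 0.65)
type_numeric_only <- typeof(results)   # still numeric at this point

# A single character assignment coerces the whole array, not just one column.
results[1:4, 3] <- "gaussian"
type_after_string <- typeof(results)

print(type_numeric_only)
print(type_after_string)
print(results[, "MSE"])   # the MSE values are now strings
```

If that is what happens in the loop, the numeric columns only look numeric when printed; they are stored as text.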
If I run `results %>% group_by(K) %>% summarize(mean_val = mean(MSE))`, I get the error message

    Error in UseMethod("group_by_"): no applicable method for 'group_by_' applied to an object of class "c('matrix', 'character')"

which I assume means that you can't group by something without numeric values.
Any tips on what I'm doing incorrectly would be much appreciated. Thank you!
EDIT

It was noted in the comments that dplyr commands only work with a `data.frame` or a tibble. However, converting the `results` array into either of these does not work either.
If I run the line:

    results = data.frame(results)

then running `str(results)` returns the following picture:

[![enter image description here][1]][1]

I get something similar when using `as_tibble` in place of `data.frame`.
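
A possible way forward from this point, as a minimal base R sketch: a data frame built from a character matrix keeps text (or factor) columns, so the numeric columns would need an explicit conversion. The tiny matrix below is a made-up placeholder for the real `results`, and `num_cols` is a name introduced only for this example.

```r
# Character matrix standing in for the `results` array (placeholder values).
m <- cbind(K = c("1", "2"),
           MSE = c("0.5", "0.25"),
           KERNEL = c("gaussian", "gaussian"))

df <- data.frame(m, stringsAsFactors = FALSE)

# Convert only the columns that are meant to be numeric.
# as.character() first keeps this safe if the columns came in as factors.
num_cols <- c("K", "MSE")
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(as.character(x)))

str(df)
print(mean(df$MSE))
```

After a conversion like this, `mean(MSE)` inside `summarize` should receive numbers rather than text.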
Running the dplyr commands then gives the following warning:

    Warning message in mean.default(MSE):
    "argument is not numeric or logical: returning NA"

So I think I'm still about where I started.
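
An alternative that avoids the array altogether, sketched under assumptions: build one data frame per kernel, so each column keeps its own type, and bind them at the end. `fake_fit` is a hypothetical stand-in for `train.kknn` that returns placeholder numbers so the example runs without kknn; with the package loaded, the call from the loop above would take its place. `as.numeric()` is applied in case the error components come back as one-column matrices, which is an assumption about the fitted object.

```r
kernels <- c("gaussian", "rectangular")
kmax <- 3

# Hypothetical stand-in for train.kknn(); values are placeholders, not real errors.
fake_fit <- function(kernel, kmax) {
  list(MEAN.SQU = seq_len(kmax) / 10, MEAN.ABS = seq_len(kmax) / 20)
}

per_kernel <- lapply(kernels, function(k) {
  model <- fake_fit(k, kmax)
  data.frame(
    K = seq_len(kmax),
    MSE = as.numeric(model$MEAN.SQU),
    MAE = as.numeric(model$MEAN.ABS),
    KERNEL = k,
    stringsAsFactors = FALSE
  )
})

results_df <- do.call(rbind, per_kernel)
print(sapply(results_df, typeof))

# Base R equivalent of group_by(K) %>% summarize(mean_val = mean(MSE)).
mean_by_k <- aggregate(MSE ~ K, data = results_df, FUN = mean)
print(mean_by_k)
```

Because `results_df` is already a data frame with numeric `K`, `MSE` and `MAE` columns, the dplyr pipeline from the question should be usable on it directly.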
Thank you.

  [1]: https://i.stack.imgur.com/QsO7o.png