I want to execute a function in R that comes from the following textbook (on p.20, but I posted it below): media.readthedocs.org/pdf/little-book-of-r-for-multivariate-analysis/latest/little-book-of-r-for-multivariate-analysis.pdf
The dataset I'm trying it on (the dataset used in this PDF) can be found here:
wine <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data",
sep=",")
The function is first defined as follows, and then executed (last line):
calcBetweenGroupsVariance <- function(variable, groupvariable)
{
  # find out how many values the group variable can take
  groupvariable2 <- as.factor(groupvariable[[1]])
  levels <- levels(groupvariable2)
  numlevels <- length(levels)
  # calculate the overall grand mean:
  grandmean <- mean(variable)
  # get the mean and standard deviation for each group:
  numtotal <- 0
  denomtotal <- 0
  for (i in 1:numlevels)
  {
    leveli <- levels[i]
    levelidata <- variable[groupvariable==leveli,]
    levelilength <- length(levelidata)
    # get the mean and standard deviation for group i:
    meani <- mean(levelidata)
    sdi <- sd(levelidata)
    numi <- levelilength * ((meani - grandmean)^2)
    denomi <- levelilength
    numtotal <- numtotal + numi
    denomtotal <- denomtotal + denomi
  }
  # calculate the between-groups variance
  Vb <- numtotal / (numlevels - 1)
  Vb <- Vb[[1]]
  return(Vb)
}
calcBetweenGroupsVariance(wine[2], wine[1])
It should give me the between-groups variance for the variable "V2" (second column) based on the three labels (first column). Unfortunately, R tells me:
The structure of the dataset looks like this:
I don't know how to solve this. According to str(), the second column contains numeric data. I also tried this function on another dataset and ran into the same issue. I searched for this error message and there are quite a few topics about it, but I can't see how any of them relate to my problem.
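For comparison, here is a minimal sketch of the same between-groups calculation written for plain vectors rather than one-column data frames. It assumes the trouble is related to passing data frames (wine[2], wine[1]) where functions such as mean() expect a numeric vector; that is an assumption, not a confirmed diagnosis. The name calcBetweenGroupsVarianceVec and the toy data are made up for illustration and do not come from the textbook.

```r
# Hypothetical helper: between-groups variance for a numeric vector x
# and a grouping vector g of the same length.
calcBetweenGroupsVarianceVec <- function(x, g) {
  g <- as.factor(g)
  grandmean  <- mean(x)                  # overall mean of x
  groupmeans <- tapply(x, g, mean)       # mean of x within each group
  groupsizes <- tapply(x, g, length)     # number of observations per group
  # size-weighted squared deviations of group means from the grand mean,
  # divided by (number of groups - 1)
  sum(groupsizes * (groupmeans - grandmean)^2) / (nlevels(g) - 1)
}

# Toy data, invented purely to illustrate the call
x <- c(1, 2, 3, 4, 5, 6)
g <- c("a", "a", "b", "b", "c", "c")
vb <- calcBetweenGroupsVarianceVec(x, g)
vb  # 8 for this toy data

# With the wine data, double brackets extract the columns as vectors:
# calcBetweenGroupsVarianceVec(wine[[2]], wine[[1]])
```

The design choice here is to extract columns with [[ ]] before doing any arithmetic, so that mean() and tapply() always receive vectors instead of data frames.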
If someone could give me a hint about what to do, I would be very grateful! If you need more information, please tell me.
Thanks a lot in advance,