Inside your function you need to use square brackets to reference an index (i) in your vector means. Without this, you will only return the mean of the last column.
To get started, here is your function working on some example data:
# Create an example set of data
randomValues <- data.frame("A"=rnorm(100, mean=0), "B"=rnorm(100, mean=5), "C"=rnorm(100, mean=500))
# Create a function to calculate the mean value of each column
meanOfEachColumn <- function(dataframe){
  # Initialise a vector to store the calculated means
  means <- numeric(length=ncol(dataframe))
  # Examine each column in the dataframe
  for(column in 1:ncol(dataframe)){
    means[column] <- mean(dataframe[, column])
  }
  return(means)
}
# Calculate the mean of each column
meansOfColumns <- meanOfEachColumn(randomValues)
print(meansOfColumns)
0.04983223 4.93306557 500.21016834
The means[column] in the above code means that the mean of each column is stored in its own position in the vector that is returned.
Without it you would get the following:
# Create a function to calculate the mean value of each column
meanOfEachColumn <- function(dataframe){
  # Initialise a vector to store the calculated means
  means <- numeric(length=ncol(dataframe))
  # Examine each column in the dataframe
  for(column in 1:ncol(dataframe)){
    means <- mean(dataframe[, column])
  }
  return(means)
}
# Calculate the mean of each column
meansOfColumns <- meanOfEachColumn(randomValues)
print(meansOfColumns)
500.2101
Notice that this is the mean of the last column in the example dataframe randomValues. On every pass through the loop the whole vector of means (means) is replaced by the mean of the current column, so only the mean of the last column survives.
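To see the difference in isolation, here is a minimal sketch using a made-up vector (values is a hypothetical name, not something from your code). Assigning with an index changes one position, while assigning without an index replaces the whole object:

```r
# 'values' is a hypothetical vector used only for illustration
values <- numeric(length = 3)

# Indexed assignment: only position 2 changes, the vector keeps its length
values[2] <- 10
lengthAfterIndexed <- length(values)

# Plain assignment: the whole vector is replaced by a single number
values <- 10
lengthAfterPlain <- length(values)

print(c(lengthAfterIndexed, lengthAfterPlain))
```

The first length is 3 and the second is 1, which is the same thing that happens to means inside the loop.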
Also, as a general note, you should try to include a reproducible example alongside your question. See this post for more details.
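As a rough sketch of what that could look like when the data are random, you can fix the seed and share the object with dput(). The name exampleData and the seed value 42 are arbitrary choices for illustration:

```r
# Fix the random seed so rnorm() returns the same numbers on every run
set.seed(42)
exampleData <- data.frame(A = rnorm(5, mean = 0), B = rnorm(5, mean = 5))

# dput() prints R code that recreates the object, ready to paste into a question
dput(exampleData)
```

Anyone who runs these lines gets the same dataframe, so they can reproduce your results exactly.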
Lastly, there is a function in R that already calculates the mean of every column in a dataframe:
meansOfColumns <- colMeans(randomValues)
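One small difference worth knowing: colMeans() returns a named vector, whereas the loop above returns an unnamed one. The sketch below uses a small hypothetical dataframe (exampleData, with values chosen so the means are easy to check by hand) and also shows sapply() as another possible way to get the same result:

```r
# Small hypothetical dataframe with known column means (2 and 20)
exampleData <- data.frame(A = c(1, 2, 3), B = c(10, 20, 30))

# colMeans() returns a named numeric vector, one element per column
builtInMeans <- colMeans(exampleData)

# sapply() applies mean() to each column and also keeps the column names
sapplyMeans <- sapply(exampleData, mean)

# unname() drops the names, giving a plain vector like the loop version returns
plainMeans <- unname(builtInMeans)

print(builtInMeans)
```

Keeping the names is usually convenient, since you can then look up a column's mean with builtInMeans["A"].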