Why do I have to use means[i], and not just means, to store the result? When I use means on its own, I get a different value.

mymeans <- function(x){

    means <- numeric(ncol(x))

    for (i in 1:ncol(x)){

        means[i] <- mean(x[,i])
    }
    return(means[i])
}
Joseph Crispell

1 Answer

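Inside your function, the square brackets reference position i in your vector means, so each column's mean is stored in its own element. Without the index, every pass through the loop overwrites the whole of means, and you are left with only the mean of the last column.

Note also that your function ends with return(means[i]). Once the loop has finished, i holds the number of the last column, so that returns only the last element. Use return(means) to get all of the means.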
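As a minimal sketch of the difference between the two kinds of assignment (the vector length and the value 10 are made up purely for illustration):

```r
# Start with a vector of three zeros
means <- numeric(3)

# Indexed assignment changes only the second element
means[2] <- 10
indexedResult <- means      # now holds 0, 10, 0

# Plain assignment replaces the whole vector with a single value
means <- 10
overwrittenResult <- means  # now holds just 10

print(indexedResult)
print(overwrittenResult)
```

After the indexed assignment the vector still has three elements; after the plain assignment it has only one.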

To get started, here is your function working on some example data:

# Create an example set of data
randomValues <- data.frame("A"=rnorm(100, mean=0), "B"=rnorm(100, mean=5), "C"=rnorm(100, mean=500))

# Create a function to calculate the mean value of each column
meanOfEachColumn <- function(dataframe){

    # Initialise a vector to store the calculated means
    means <- numeric(length=ncol(dataframe))

    # Examine each column in the dataframe
    for(column in 1:ncol(dataframe)){

        means[column] <- mean(dataframe[, column])
    }

    return(means)
}

# Calculate the mean of each column
meansOfColumns <- meanOfEachColumn(randomValues)
print(meansOfColumns)

0.04983223 4.93306557 500.21016834

Assigning to means[column] in the above code stores each column's mean at that column's position in the vector, so the function returns one mean per column.

Without it you would get the following:

# Create a function to calculate the mean value of each column
meanOfEachColumn <- function(dataframe){

    # Initialise a vector to store the calculated means
    means <- numeric(length=ncol(dataframe))

    # Examine each column in the dataframe
    for(column in 1:ncol(dataframe)){

        means <- mean(dataframe[, column])
    }

    return(means)
}

# Calculate the mean of each column
meansOfColumns <- meanOfEachColumn(randomValues)
print(meansOfColumns)

500.2101

You'll notice that this is the mean of the last column in the example dataframe randomValues. On every pass through the loop, the vector of means (means) is replaced by the mean of the current column, so only the last one survives.

Also, as a general note, you should try to include a reproducible example alongside your question. See this post for more details.

Lastly, there is a function in R that already calculates the mean of every column in a dataframe:

meansOfColumns <- colMeans(randomValues)
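
If you want to convince yourself that the two approaches agree, something like the following should work. This is only a sketch: the seed and the name loopMeans are my own additions, and the data frame is rebuilt so the snippet runs on its own.

```r
# Fix the seed so the random data is the same on every run (arbitrary value)
set.seed(42)
randomValues <- data.frame("A"=rnorm(100, mean=0), "B"=rnorm(100, mean=5), "C"=rnorm(100, mean=500))

# Means calculated with the explicit loop and indexed assignment
loopMeans <- numeric(length=ncol(randomValues))
for(column in seq_len(ncol(randomValues))){
    loopMeans[column] <- mean(randomValues[, column])
}

# Means calculated with the built-in function (returns a named vector)
meansOfColumns <- colMeans(randomValues)

# Compare the values, ignoring the column names attached by colMeans
sameResult <- isTRUE(all.equal(loopMeans, unname(meansOfColumns)))
print(sameResult)
```

Note that colMeans keeps the column names on its result, which is why the names are dropped with unname before comparing.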
Joseph Crispell