
Very basic question here! On a 10 x 2 data frame, I want to calculate the average of each column and place the result in a new 1 x 2 matrix.

colMeans() calculates the averages, but returns the result as a vector of length 2 rather than a matrix. Attempting to transpose the vector gives the error message "Error in transpose(AverageX) : l must be a list."

As a workaround, I created an empty 1 x 2 matrix (filled with NAs), used rbind to append the Average vector to it, and then dropped the row of NAs.

It works, but I bet I could do the same with two lines of code rather than four. Would anyone have a better way to achieve this? Thanks very much for the help.

df <- data.frame(Price = c(1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205),
                 XXX = c(1218, 1218, 1219, 1218, 1221, 1217, 1217, 1216, 1219, 1216))

Average <- colMeans(df, na.rm = TRUE)  # result is a vector => turn it into a 1 x m matrix
AverageMatrix <- matrix(nrow = 1, ncol = 2)
AverageMatrix <- rbind(AverageMatrix, Average)
AverageMatrix <- AverageMatrix[-1, , drop = FALSE]

EDIT: @akrun creating a matrix of averages and creating an average for each column of a matrix are similar yet technically different issues, hence in my opinion this is not a duplicate. The question expresses that I already knew how to calculate an average for each column of a matrix, and that I wanted a more efficient way to store that result as a matrix, a simple issue that is not addressed in the other question.

Krug

3 Answers


You can get a named matrix of column means by simply transposing the result of colMeans() with t(), because, from help(t):

When x is a vector, it is treated as a column, i.e., the result is a 1-row matrix.

t(colMeans(df, na.rm = TRUE))
#       Price    XXX
# [1,] 1214.4 1217.9
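
As a quick check, here is a minimal sketch (reusing the question's df and its AverageMatrix name) confirming that t() gives a 1 x 2 matrix with the column names intact. The error quoted in the question looks like it comes from data.table::transpose(), which expects a list; that reading is an assumption based on the error text alone.

```r
# Data from the question
df <- data.frame(Price = c(1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205),
                 XXX = c(1218, 1218, 1219, 1218, 1221, 1217, 1217, 1216, 1219, 1216))

# base::t() treats a plain vector as a column, so the transpose has one row
AverageMatrix <- t(colMeans(df, na.rm = TRUE))

is.matrix(AverageMatrix)   # TRUE
dim(AverageMatrix)         # 1 2
colnames(AverageMatrix)    # "Price" "XXX"
```

This replaces the four-line workaround with a single assignment.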
Rich Scriven

With data.table, we can loop through the columns and get the mean of each, after converting the 'data.frame' to a 'data.table' with setDT(df).

library(data.table)
setDT(df)[, lapply(.SD, mean, na.rm = TRUE)]
#   Price    XXX
#1: 1214.4 1217.9
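
Note that this returns a one-row data.table rather than a matrix. If a matrix is what is needed, one option is to pass the result through as.matrix(). The sketch below assumes data.table is installed and uses as.data.table() to work on a copy, since setDT() converts df in place; the names dt, means_dt and AverageMatrix are illustrative.

```r
library(data.table)

# Data from the question
df <- data.frame(Price = c(1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205),
                 XXX = c(1218, 1218, 1219, 1218, 1221, 1217, 1217, 1216, 1219, 1216))

dt <- as.data.table(df)                            # copy; df itself stays a data.frame
means_dt <- dt[, lapply(.SD, mean, na.rm = TRUE)]  # one-row data.table of column means
AverageMatrix <- as.matrix(means_dt)               # 1 x 2 numeric matrix, names kept

dim(AverageMatrix)       # 1 2
colnames(AverageMatrix)  # "Price" "XXX"
```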
akrun

I use dplyr for things like this. With dplyr you can do:

df <- data.frame(Price = c(1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205),
                 XXX = c(1218, 1218, 1219, 1218, 1221, 1217, 1217, 1216, 1219, 1216))

library(dplyr)
res <- df %>% summarize(mean(Price), mean(XXX))
res
#  mean(Price) mean(XXX)
#1      1214.4    1217.9

Edit: I forgot you wanted it as a matrix:

res <- df %>% summarize(mean(Price), mean(XXX)) %>% as.matrix()
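
When there are many columns, or the original column names should be kept, a sketch using across() may be more convenient than naming each column. It assumes dplyr 1.0.0 or later (where across() was introduced) and that every column is numeric; the name res_mat is illustrative.

```r
library(dplyr)

# Data from the question
df <- data.frame(Price = c(1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205),
                 XXX = c(1218, 1218, 1219, 1218, 1221, 1217, 1217, 1216, 1219, 1216))

# Apply mean() to every column; across() keeps the original column names
res_mat <- df %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE))) %>%
  as.matrix()

dim(res_mat)       # 1 2
colnames(res_mat)  # "Price" "XXX"
```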
AllanT
  • Thanks a lot, AllanT. I'd prefer that the column names not change, and I'm actually applying this to hundreds of columns, so I'll be using one of the other answers. Very useful nonetheless. – Krug Apr 19 '16 at 23:50