3

I'm new to R. The professor asked us to obtain the sum, mean and variance for several columns of data stored in an Excel file. I want to use R for this rather than entering a formula in Excel and dragging it across. I have imported the data into R and it displays correctly. I can use sum(), sd() and var() on EACH column individually.

My question is: is there a way to let R display the sum, sd, and variance for each column at the same time? (Rather than calculating these again and again for each column).

I mean something like colSum(col1, col2, col3, ...), where a single line shows the sum for each column.

lmo
pythh
  • check this one http://stackoverflow.com/questions/21807987/calculate-the-mean-for-each-column-of-a-matrix-in-r – Hari Feb 09 '17 at 17:58
  • `colSums` is there for the column sums, but you will have to use `lapply` with a custom function for the `sd` and `var` across columns. – Rich Scriven Feb 09 '17 at 18:16
  • @RichScriven Thanks! Check my answer. It worked in rstudio. – pythh Feb 09 '17 at 18:19

2 Answers

10

More generally you would do something like:

sapply(data, sum)
sapply(data, var)
sapply(data, sd)

Or in one line as suggested by Agile Bean:

sapply(data, function(x) c(sum=sum(x), var=var(x), sd=sd(x)))
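
As a minimal sketch of the one-line approach above, here is a worked example. The data frame `scores` and its column names are hypothetical stand-ins for the imported Excel data, and the step that keeps only numeric columns is an added assumption (sum() fails on text columns).

```r
# Hypothetical data standing in for the imported Excel sheet.
scores <- data.frame(
  id    = c("a", "b", "c", "d"),
  exam1 = c(1, 2, 3, 4),
  exam2 = c(2, 4, 6, 8)
)

# Keep only the numeric columns so sum() does not fail on text columns.
numeric_cols <- scores[sapply(scores, is.numeric)]

# sapply() returns a matrix: one row per statistic, one column per variable.
stats <- sapply(numeric_cols, function(x) {
  c(sum = sum(x), mean = mean(x), var = var(x), sd = sd(x))
})

print(stats)
```

If one row per column is easier to read, t(stats) transposes the result. If the data contains missing values, passing na.rm = TRUE to each function is one way to skip them.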
s_baldur
  • 1
    This is a nice solution. To avoid redundancy, you can express the above three lines as `funs <- function(x) c(sum = sum(x), var = var(x), sd = sd(x)); sapply(data, funs)`. For more than three functions, this would still be only two lines of code. – Agile Bean Jan 16 '19 at 04:37
3

I just figured it out. Basically, I need to use colSums() and colMeans(). For example, colSums(data[, 2:5]) calculates the sum of each column from column 2 to column 5.
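
A small sketch of this approach, using a hypothetical data frame `scores` (the names and values are made up for illustration). Base R has no colVars(), so the variance here falls back on sapply():

```r
# Hypothetical data; column 1 is an ID, columns 2 to 4 hold the values.
scores <- data.frame(
  id    = 1:3,
  exam1 = c(10, 20, 30),
  exam2 = c(1, 2, 3),
  exam3 = c(5, 5, 5)
)

# Select the columns of interest by position.
cols <- scores[, 2:4]

col_sums  <- colSums(cols)
col_means <- colMeans(cols)
# No built-in column-variance function in base R, so apply var() per column.
col_vars  <- sapply(cols, var)

# Stack the results into one table: one row per statistic.
summary_table <- rbind(sum = col_sums, mean = col_means, var = col_vars)
print(summary_table)
```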

pythh