
I am looking to find the sum of certain columns in my dataset. Currently it looks something like this.

I want to find the column sums of X, Y and Z for each possible grid and month combination. Currently I have:

xx <- data[data$Month == "November" & data$grid == "A3", ]

fun <- by(xx[, 1:3], xx$grid, colSums, na.rm = TRUE)
fun <- as.character(fun)

as.data.frame(fun,
              stringsAsFactors = default.stringsAsFactors())

But this requires me to change the grid and month references each time. Is there a simpler way to do this without manually specifying which grids and months I want?

nicola
Fosulli

1 Answer


We can use summarise_each from dplyr after grouping by 'month' and 'grid':

library(dplyr)
data %>%
   group_by(month, grid) %>%
   summarise_each(funs(sum))
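In recent dplyr releases summarise_each() and funs() are deprecated in favour of across(). A minimal sketch of the same grouped sum in that style follows. The toy data and the column names X, Y, Z, month, grid and note are assumptions for illustration, not the asker's real dataset. Naming the columns inside across() also keeps any other factor column out of sum().

```r
library(dplyr)

# Toy data; every column name here is an assumption for illustration.
data <- data.frame(
  X = c(1, 2, 3, 4),
  Y = c(10, NA, 30, 40),
  Z = c(5, 5, 5, 5),
  month = c("November", "November", "November", "December"),
  grid = c("A3", "A3", "B1", "A3"),
  note = factor(c("a", "b", "a", "b"))  # extra factor column, deliberately not summed
)

# Sum only X, Y and Z within each month/grid combination, skipping NAs.
sums <- data %>%
  group_by(month, grid) %>%
  summarise(across(c(X, Y, Z), ~ sum(.x, na.rm = TRUE)), .groups = "drop")

print(sums)
```

The result has one row per month/grid combination that actually occurs in the data.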

Or with aggregate from base R

aggregate(.~month + grid, data, FUN = sum)
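The `.` on the left-hand side sums every column that is not a grouping variable, which fails if the data holds another factor column. A hedged sketch that names the columns explicitly is shown here. The toy data and the column names X, Y, Z, month, grid and note are assumptions for illustration.

```r
# Toy data; every column name here is an assumption for illustration.
data <- data.frame(
  X = c(1, 2, 3, 4),
  Y = c(10, NA, 30, 40),
  Z = c(5, 5, 5, 5),
  month = c("November", "November", "November", "December"),
  grid = c("A3", "A3", "B1", "A3"),
  note = factor(c("a", "b", "a", "b"))  # extra factor column that sum() cannot handle
)

# cbind(X, Y, Z) restricts the sum to those three columns, so 'note' is ignored.
# na.action = na.pass keeps rows containing NA; na.rm = TRUE then skips the
# missing values column by column inside sum().
sums <- aggregate(cbind(X, Y, Z) ~ month + grid, data = data,
                  FUN = sum, na.rm = TRUE, na.action = na.pass)

print(sums)
```

Without na.action = na.pass, the formula method drops any row that has a missing value in one of the listed columns before summing.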

Or using the OP's method

by(data[1:3], data[4:5], FUN = colSums)
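If a plain table is wanted rather than a by object, one possible variant uses split() and sapply() instead of by(). This is a sketch under assumed toy data and column names (X, Y, Z, month, grid), not the method from the question.

```r
# Toy data; every column name here is an assumption for illustration.
data <- data.frame(
  X = c(1, 2, 3, 4),
  Y = c(10, NA, 30, 40),
  Z = c(5, 5, 5, 5),
  month = c("November", "November", "November", "December"),
  grid = c("A3", "A3", "B1", "A3")
)

# Split the three value columns by every month/grid combination that occurs
# (drop = TRUE discards combinations with no rows).
pieces <- split(data[c("X", "Y", "Z")], list(data$month, data$grid), drop = TRUE)

# Column sums within each piece; t() puts one combination per row.
sums <- t(sapply(pieces, colSums, na.rm = TRUE))

print(sums)
```

The row names combine the two grouping values, for example "November.A3".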
akrun
  • Thanks, but I still can't get any of those to work. I can't use dplyr on the version of R I am currently using, and after I downloaded the newest version it has trouble finding the command `group_by`, even though it says I have successfully installed dplyr. The aggregate function comes up with "in array(dim = extent, dimnames = namelist) : negative length vectors are not allowed In addition: Warning messages: 1: In ngroup * (as.integer(index) - one) : NAs produced by integer overflow 2: In ngroup * nlevels(index) : NAs produced by integer overflow" – Fosulli Jul 07 '16 at 11:40
  • @Fosulli As you haven't shown a reproducible example in your post, it is difficult to comment. Have you loaded the library, i.e. `library(dplyr)`? – akrun Jul 07 '16 at 12:13
  • Sorry, I should have explained it better. When running the code `data %>% group_by(month, grid) %>% summarise_each(funs(sum))`, the error comes up that "sum" is not meaningful for factors. How do you specify that you want to sum columns x, y and z? – Fosulli Jul 07 '16 at 13:18
  • I figured it out: I had another factor variable in the dataset and it wouldn't work with the "sum" function. Thank you for your help. – Fosulli Jul 07 '16 at 13:36