I'm attempting to collapse a dataframe so that there is one row per group. The aggregate function seems like my best bet, but I'm not sure how to have some columns summed while others stay the same.

My dataframe looks like this:

A  1  3  2
A  2  3  4
B  1  2  4
B  4  2  2

How can I use the aggregate function or the ddply function to create something that looks like this:

A  3  3  6
B  5  2  6
thelatemail

1 Answer

We can use dplyr. Within each group, a column is kept as-is if it has a single distinct value and summed otherwise:

library(dplyr)
df1 %>%
   group_by(col1) %>% 
   summarise_each(funs(if(n_distinct(.)==1) .[1] else sum(.)))
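
In dplyr 1.0.0 and later, `summarise_each()` and `funs()` are deprecated in favour of `across()`. A minimal sketch of the same idea, assuming `df1` has the columns `col1` to `col4` used in this answer (the data frame below is a reconstruction of the question's data, not the asker's actual object):

```r
library(dplyr)

# Assumed reconstruction of the question's data with the column names used in this answer
df1 <- data.frame(
  col1 = c("A", "A", "B", "B"),
  col2 = c(1L, 2L, 1L, 4L),
  col3 = c(3L, 3L, 2L, 2L),
  col4 = c(2L, 4L, 4L, 2L)
)

# For every non-grouping column: keep the value if it is constant
# within the group, otherwise sum it
res <- df1 %>%
  group_by(col1) %>%
  summarise(across(everything(),
                   ~ if (n_distinct(.x) == 1) .x[1] else sum(.x)))
res
```

Note that this rule is decided per group and per column: a column that is meant to be summed but happens to hold identical values within a group (say 2 and 2) would return 2 rather than 4.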

Another option, if 'col3' is constant within each 'col1' group, is to include it in the group_by and sum the remaining columns:

df1 %>%
   group_by(col1, col3) %>%
   summarise_each(funs(sum))
#   col1  col3  col2  col4
#  <chr> <int> <int> <int>
#1     A     3     3     6
#2     B     2     5     6

Or with aggregate

 aggregate(.~col1+col3, df1, FUN = sum)
 #   col1 col3 col2 col4
 #1    B    2    5    6
 #2    A    3    3    6
akrun
  • Or with `data.table` - `df1[, lapply(.SD, sum), by=.(V1,V3)]` - `df1[, lapply(.SD, sum), by=.(V1,V3)][, names(df1), with=FALSE]` if the order of the columns is important. – thelatemail Nov 29 '16 at 02:43
  • I'm running into a "'sum' not meaningful for factors" error. I'm assuming that means my dataframe is being created improperly and I need to convert it to something else. Is there a good way to do that? – Alex Tippett Nov 29 '16 at 02:55
  • @AlexTippett I think you may have created a matrix and then converted it to a data.frame. One way is to do `df1[] <- lapply(df1, function(x) type.convert(as.character(x)))` and then apply the code above. – akrun Nov 29 '16 at 04:22
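
The conversion suggested in the last comment can be sketched in base R as follows. The way the factor columns arise here (a character matrix turned into a data frame with `stringsAsFactors = TRUE`) is only an assumption about what happened in the asker's session, and `as.is = TRUE` is added because recent R versions warn when `type.convert()` is called without it:

```r
# Hypothetical reproduction: a character matrix converted to a data frame
# with factor columns, which makes sum() fail
m <- matrix(c("A", "A", "B", "B",
              "1", "2", "1", "4",
              "3", "3", "2", "2",
              "2", "4", "4", "2"), ncol = 4)
df1 <- as.data.frame(m, stringsAsFactors = TRUE)
names(df1) <- paste0("col", 1:4)
classes_before <- sapply(df1, class)   # every column is a factor

# Re-detect each column's type from its character representation;
# df1[] keeps the data frame structure while replacing the columns
df1[] <- lapply(df1, function(x) type.convert(as.character(x), as.is = TRUE))
classes_after <- sapply(df1, class)    # col1 character, the rest integer

# The aggregate call from the answer now works
res <- aggregate(. ~ col1 + col3, df1, FUN = sum)
res
```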