3

I am trying to get the sum of elements based on unique names across a list containing unknown number of dataframes.

## Test Data
Name1 <- c("A","B","C","D")
Name2 <- c("A","D")
Name3 <- c("B","C","F")
Values1 <- c(1,2,3,4)
Values2 <- c(5,7)
Values3 <- c(6,8,9)
DF1 <- data.frame(Name1,Values1,stringsAsFactors = FALSE)
DF2 <- data.frame(Name2,Values2,stringsAsFactors = FALSE)
DF3 <- data.frame(Name3,Values3,stringsAsFactors = FALSE)

DFList <- list(DF1,DF2,DF3)

My Output will be:

A  B  C  D  F 
6  8 11 11  9 

I am not sure if using a loop is effective, since there can be any number of dataframes in the list and the number of unique rows in a dataframe can range anywhere between 100,000 to 1 Million.

Jason Mathews
  • 265
  • 1
  • 3
  • 13

3 Answers3

4

Solution using data.table::rbindlist:

data.table::rbindlist(DFList)[, sum(Values1), Name1]
   Name1 V1
1:     A  6
2:     B  8
3:     C 11
4:     D 11
5:     F  9

rbindlist binds columns despite their names and then you can sum(Values1) by Name1.

pogibas
  • 27,303
  • 19
  • 84
  • 117
3
sapply(split(unlist(lapply(DFList, "[[", 2)), unlist(lapply(DFList, "[[", 1))), sum)
# A  B  C  D  F 
# 6  8 11 11  9 

OR

aggregate(formula = Value~Name,
          data = do.call(rbind, lapply(DFList, function(x) setNames(x, c("Name", "Value")))),
          FUN = sum)
#  Name Value
#1    A     6
#2    B     8
#3    C    11
#4    D    11
#5    F     9
d.b
  • 32,245
  • 6
  • 36
  • 77
2

Similar to the answer of @d.b.

lst <- unlist(lapply(DFList, function(DF) setNames(DF[[2]], DF[[1]])))
tapply(lst, names(lst), sum)
#A  B  C  D  F 
#6  8 11 11  9
Rui Barradas
  • 70,273
  • 8
  • 34
  • 66