
I ran into a big problem when trying to apply my small-scale solution at a larger scale. I want to write a function that automates summing the values of specific data frames.

First, I created a list of all the data frames:

> lst
$data001
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data002
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data003
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80
 Z   20  40  60  80

$data004
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80
 Z   20  40  60  80
 V   20  40  60  80

$data005
 A   B   C   D   E
 Q   10  30  50  70

$data006
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data007
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data008
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data09
 A   B   C   D   E
 X   11  33  55  77
 Y   22  44  66  88

$data010
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

Second, I determined which data frames I would like to add together (all the 1s together, all the 2s together, etc.). In this example there are 10 data frames, and their groups, in the order they appear in lst, are:

 [1] 1 1 2 2 2 2 2 2 3 2

Adding all the "ones" manually (with plyr) would look something like this:

> ddply(rbind(lst[[1]],lst[[2]]), "A", numcolwise(sum))

 A   B   C   D    E
 X   20  60  100  140
 Y   40  80  120  160

Adding all the "twos" manually would give something like this:

 A   B   C   D    E
 X   60  180 300 420
 Y   120 240 360 480
 Z   40  80  120 160
 V   20  40  60  80
 Q   10  30  50  70

However, I cannot figure out how to write a loop that creates a list of, in this example, 3 data frames, each the result of summing the selected data frames.

Thank you in advance!

An economist
  • Please show an example that reflects the size differences. It would also be great to have a small reproducible example (instead of the `..`). It helps greatly with coding. The idea would be to convert to equal-sized datasets. One option is `rbindlist`, i.e. `library(data.table);dt <- rbindlist(lst, idcol=TRUE, fill=TRUE)` – akrun Feb 11 '16 at 12:52
  • Just give me a second so I can create it. Anyway, thank you for the input. – An economist Feb 11 '16 at 12:54
  • @akrun I hope it helps. It is just an example, but quite representative. I want to automate the process because there might be more than 100 data frames. – An economist Feb 11 '16 at 13:08
  • I updated with a possible solution. Can you check whether it is what you wanted? – akrun Feb 11 '16 at 13:24
  • @akrun Thank you so much, it works perfectly! I would never have figured it out. – An economist Feb 11 '16 at 13:34

1 Answer


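We can use data.table: split the indices of lst by the grouping vector v1, bind each group's data frames with rbindlist, then sum columns B to E within each value of A.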

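 library(data.table)
 # v1 is the grouping vector shown under "data" below
 lapply(split(seq_along(lst), v1), function(i)
         # stack this group's data frames, then sum B to E within each A
         rbindlist(lst[i], fill=TRUE)[
             , lapply(.SD, sum), A, .SDcols= B:E])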
#$`1`
#   A  B  C   D   E
#1: X 20 60 100 140
#2: Y 40 80 120 160

#$`2`
#   A   B   C   D   E
#1: X  60 180 300 420
#2: Y 120 240 360 480
#3: Z  40  80 120 160
#4: V  20  40  60  80
#5: Q  10  30  50  70

#$`3`
#   A  B  C  D  E
#1: X 11 33 55 77
#2: Y 22 44 66 88
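
A variation on the same idea, following the `rbindlist(..., idcol=)` hint from the comments on the question: stack everything once, attach the group to each row, and sum by group and `A` in a single pass. This is only a sketch on a small toy list, not the question's data; the column names `src` and `grp` are made up for illustration.

```r
library(data.table)

# Toy input (hypothetical, smaller than the question's list)
lst <- list(
  data001 = data.frame(A = c("X", "Y"), B = c(10, 20), C = c(30, 40)),
  data002 = data.frame(A = c("X", "Y"), B = c(10, 20), C = c(30, 40)),
  data003 = data.frame(A = c("X", "Z"), B = c(10, 20), C = c(30, 40))
)
v1 <- c(1, 1, 2)  # group of each list element, in list order

# Stack all data frames once; "src" records the list element each row came from
dt <- rbindlist(lst, idcol = "src", fill = TRUE)

# Attach the group by looking up each row's source name
dt[, grp := v1[match(src, names(lst))]]

# Sum the numeric columns within each (group, A) pair
sums <- dt[, lapply(.SD, sum), by = .(grp, A), .SDcols = c("B", "C")]

# Split back into one table per group if a list is needed
res <- split(sums, by = "grp", keep.by = FALSE)
res
```

With more than 100 data frames this keeps the work in one grouped aggregation instead of one `rbindlist` call per group; whether that matters in practice depends on the data.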

data

v1 <-  c(1, 1, 2, 2, 2, 2, 2, 2, 3, 2)
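
The question only prints `lst`, so here is one way to rebuild it for testing, together with a base R alternative that needs no packages. The helper `make_df` and the lookup table `vals` are hypothetical names introduced here; the values are copied from the printout in the question. Note that `aggregate` returns rows sorted by `A`, so the row order differs from the data.table output above.

```r
# Lookup table with the B-E values printed in the question for each label
vals <- data.frame(A = c("X", "Y", "Z", "V", "Q"),
                   B = c(10, 20, 20, 20, 10),
                   C = c(30, 40, 40, 40, 30),
                   D = c(50, 60, 60, 60, 50),
                   E = c(70, 80, 80, 80, 70),
                   stringsAsFactors = FALSE)

# Hypothetical helper: build one data frame from a vector of A labels
make_df <- function(labels) {
  out <- vals[match(labels, vals$A), ]
  rownames(out) <- NULL
  out
}

xy <- c("X", "Y")
lst <- list(
  data001 = make_df(xy),
  data002 = make_df(xy),
  data003 = make_df(c("X", "Y", "Z")),
  data004 = make_df(c("X", "Y", "Z", "V")),
  data005 = make_df("Q"),
  data006 = make_df(xy),
  data007 = make_df(xy),
  data008 = make_df(xy),
  data09  = data.frame(A = xy, B = c(11, 22), C = c(33, 44),
                       D = c(55, 66), E = c(77, 88),
                       stringsAsFactors = FALSE),
  data010 = make_df(xy)
)
v1 <- c(1, 1, 2, 2, 2, 2, 2, 2, 3, 2)

# Base R alternative: split the list by group, stack each group,
# then sum the numeric columns within each value of A
res <- lapply(split(lst, v1), function(g) {
  aggregate(. ~ A, data = do.call(rbind, g), FUN = sum)
})
res
```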
akrun
  • Thank you for such a quick response. I follow your logic and it looks very promising. I am just not very familiar with `Reduce`, which is probably why I got `Error in FUN(left, right) : non-numeric argument to binary operator`. There are obviously non-numeric values in the first column, so I have to work on that. – An economist Feb 11 '16 at 12:43
  • @Aneconomist I think the first column is non-numeric. Can you try the updated solution? – akrun Feb 11 '16 at 12:46
  • Yes, you were right that the first column is non-numeric. One more problem, which I forgot to mention, is that the data frames might have different sizes, and `'+'` works only for equally sized data frames. – An economist Feb 11 '16 at 12:49
  • @Aneconomist Okay, in that case, this won't work. We need to convert the different-sized datasets to equal sizes. Can you update with an example that shows the problem? – akrun Feb 11 '16 at 12:50