3

I have a dataframe, df1:

Type     CA     AR     Total
alpha    2      3        5
beta     1      5        6
gamma    6      2        8
delta    8      1        9

and a dataframe, df2:

Type     AR     CA     Total
alpha    3      4        7
beta     2      6        8
gamma    9      1        10
delta    4      1        5

I want to add the two dataframes such that the values under "CA" are added together and that the values under "AR" are added together. Basically, the values under each heading should be added together.

The resulting df should look like this:

    Type     AR     CA     Total
    alpha    6      6        12
    beta     7      7        14
    gamma    11     7        18
    delta    5      9        14

For example: (AR, gamma) = 2 + 9 = 11

Jaap
  • 81,064
  • 34
  • 182
  • 193
user4918087
  • 421
  • 1
  • 6
  • 14

3 Answers3

4

The safest way would probably be to bind and aggregate

aggregate(.~Type, rbind(df1,df2), sum)
#    Type CA AR Total
# 1 alpha  6  6    12
# 2  beta  7  7    14
# 3 delta  9  5    14
# 4 gamma  7 11    18

The rbind.data.frame function pays attention to column names so it will properly stack your values.

MrFlick
  • 195,160
  • 17
  • 277
  • 295
3

I'll repeat my suggestion from the comments last time -- consider putting Type in rownames:

DF1 <- data.frame(df1[-1],row.names=df1$Type)
DF2 <- data.frame(df2[-1],row.names=df2$Type)

From here, adding is straightforward:

DF1 + DF2[names(DF1)]
#       CA AR Total
# alpha  6  6    12
# beta   7  7    14
# gamma  7 11    18
# delta  9  5    14

A couple of caveats: If your rows are not ordered the same way, this will not work correctly (that's why @MrFlick's approach is "safe"). Also, the extension to more data frames isn't so elegant here:

Reduce(`+`,lapply(list(DF2,DF3,DF4),`[`,order(names(DF1))),init=DF1) # here
aggregate(.~Type, rbind(df1,df2,df3,df4), sum)                       # @MrFlick
Community
  • 1
  • 1
Frank
  • 66,179
  • 8
  • 96
  • 180
  • I'm not sure that I would really consider putting the "Type" as `rownames`. For starters, `rownames` won't allow duplicates. For another, at least two very commonly used packages today ignore `rownames` unless specifically requested, and even then, put the `rownames` back in as a column. That said, duplicated `rownames` *are* allowed with matrices. – A5C1D2H2I1M1N2O1R2T1 May 30 '15 at 03:46
  • @AnandaMahto My original suggestion (http://stackoverflow.com/a/30489085/1191259) was to use matrices, but that got no traction, judging by the OP's continued use of the same example data in several subsequent questions. I figured using rownames on the data.frame was at least a step in that direction, but yeah, I probably wouldn't do it, since I always use matrices or data.tables. I suspect there are no duplicate rows in the OP's case (since this seems to be a summary table for display purposes, and dupes on the margins would be weird), but really don't know. – Frank May 30 '15 at 05:13
2

You can consider storing your data in a "long" form instead, which would make further operations more straightforward.

If you have your data.frames in a list, you can easily use melt from "reshape2" to get a "long" data.frame. For example:

melt(list(df1, df2), id.vars = "Type")

Once the data are in the long form, you can reshape it to a "wide" form using dcast, and perform whatever aggregation you want to at that stage.


Furthermore, you can generalize the creation of the list if you have similarly named data.frames in your workspace by using mget.

Here's an example, assuming we have two data.frames, one named "df1", and one named "df2":

library(reshape2)
dcast(melt(mget(ls(pattern = "df\\d+")), id.vars = "Type"), 
      Type ~ variable, value.var = "value", fun.aggregate = sum)
#    Type CA AR Total
# 1 alpha  6  6    12
# 2  beta  7  7    14
# 3 delta  9  5    14
# 4 gamma  7 11    18
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485