I have two datasets. The first one:

from to frequency
a a 2
a b 3

and this one:

from to frequency
a a 3
a b 4

Now I want to merge the two, keeping the "from" and "to" variables as they are (they are identical in both datasets) while summing the frequency for each pair.

This is what we should get:

from to frequency
a a 5
a b 7

5 Answers

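With base R, we can stack the two data frames with rbind and sum by group with aggregate (the `.` in the formula stands for all other columns, here from and to):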

> aggregate(frequency ~ ., rbind(df1, df2), sum)
  from to frequency
1    a  a         5
2    a  b         7

Or, we can use xtabs + as.data.frame + rbind (note that the summed column is then named Freq):

> as.data.frame(xtabs(frequency ~ ., rbind(df1, df2)))
  from to Freq
1    a  a    5
2    a  b    7

Data

> dput(df1)
structure(list(from = c("a", "a"), to = c("a", "b"), frequency = 2:3), class = "data.frame", row.names = c(NA,
-2L))

> dput(df2)
structure(list(from = c("a", "a"), to = c("a", "b"), frequency = 3:4), class = "data.frame", row.names = c(NA,
-2L))
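
Because this approach stacks the rows before summing, it should also cope with pairs that appear in only one of the two datasets. A minimal sketch, assuming hypothetical data frames `d1` and `d2` (not from the question) where `d2` has an extra pair:

```r
# Hypothetical data: d2 contains a pair (b, a) that d1 lacks
d1 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(2, 3))
d2 <- data.frame(from = c("a", "a", "b"), to = c("a", "b", "a"),
                 frequency = c(3, 4, 1))

# Stack the rows, then sum frequency within each (from, to) group
res <- aggregate(frequency ~ from + to, data = rbind(d1, d2), FUN = sum)
res
```

The unmatched pair simply forms a group of its own, so no frequency is lost.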
ThomasIsCoding

A dplyr-style option is the powerjoin package, whose conflict argument specifies how to combine columns that share a name in both tables:

library(powerjoin)
power_left_join(df1, df2, by = c("from", "to"), conflict = `+`)

Result:

   from to frequency
1:    a  a         5
2:    a  b         7
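
A left join keeps only the pairs present in the left table. If the two datasets might not share every pair, one alternative is a full join that treats a missing count as zero. A minimal base R sketch, assuming hypothetical data frames `d1` and `d2` that are not part of the question:

```r
# Hypothetical data: (a, b) exists only in d1, (b, a) only in d2
d1 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(2, 3))
d2 <- data.frame(from = c("a", "b"), to = c("a", "a"), frequency = c(3, 1))

# Full outer join on the key columns; the clashing frequency columns get suffixes
m <- merge(d1, d2, by = c("from", "to"), all = TRUE, suffixes = c(".x", ".y"))

# Add the two frequency columns row-wise, counting a missing value as 0
m$frequency <- rowSums(m[, c("frequency.x", "frequency.y")], na.rm = TRUE)
out <- m[, c("from", "to", "frequency")]
out
```

Whether a missing pair should count as zero depends on the data, so treat the `na.rm = TRUE` choice as an assumption.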
Julian

You could do this with an update join in data.table:

library(data.table)
df1 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(2, 3)) |> setDT()


df2 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(3, 4)) |> setDT()


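# Update join: := modifies df1 by reference, adding the matching frequency from df2
df1[df2, on = .(from, to), 
    frequency := x.frequency + i.frequency]
df1[]  # := returns invisibly, so print the updated table explicitly

Output:

     from     to frequency
   <char> <char>     <num>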
1:      a      a         5
2:      a      b         7
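
Since `:=` updates the left table by reference, the original frequencies are overwritten. If they are still needed, the join can be run on a copy. A minimal sketch, assuming data.table is installed and using hypothetical names `dt1`, `dt2`, and `res`:

```r
library(data.table)

dt1 <- data.table(from = c("a", "a"), to = c("a", "b"), frequency = c(2, 3))
dt2 <- data.table(from = c("a", "a"), to = c("a", "b"), frequency = c(3, 4))

# copy() makes a separate table, so the update join leaves dt1 itself unchanged
res <- copy(dt1)[dt2, on = .(from, to),
                 frequency := x.frequency + i.frequency]

res[]  # summed frequencies
dt1[]  # original frequencies, still intact
```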
YH Jang
library(dplyr)

df1 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(2, 3))


df2 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(3, 4))

# Stack the two data frames, then sum frequency within each (from, to) pair
merged_df <- bind_rows(df1, df2) %>%
  group_by(from, to) %>%
  summarise(frequency = sum(frequency), .groups = "drop")

print(merged_df)

  from  to    frequency
  <chr> <chr>     <dbl>
1 a     a             5
2 a     b             7
Yomi.blaze93

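Using dplyr, we can do this in two steps: join on the key columns, then add the two frequency columns.

library(dplyr)

df1 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(2, 3))
df2 <- data.frame(from = c("a", "a"), to = c("a", "b"), frequency = c(3, 4))

# The join suffixes the clashing columns as frequency.x and frequency.y;
# .keep = "unused" drops them once they have been summed
inner_join(df1, df2, by = c("to","from")) %>% 
  mutate(frequency = frequency.x + frequency.y, .keep = "unused")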
t0Ad