Given df1
as
df1 = data.frame(a = c(1,2,2),
b = c(2, 2, 3), c = c(3, 2, 1),
label = c("abc", "bbb", "bca"))
> df1
a b c label
1 1 2 3 abc
2 2 2 2 bbb
3 2 3 1 bca
and df2
as
df2 = data.frame(a = c(2, 2, 1, 1, 1, 2), b = c(2, 2, 2, 2, 2, 3),
c = c(2, 2, 3, 3, 3, 1), d = c(0.1, 0.2, 0.8, 0.7, 0.6, 0.9))
> df2
a b c d
1 2 2 2 0.1
2 2 2 2 0.2
3 1 2 3 0.8
4 1 2 3 0.7
5 1 2 3 0.6
6 2 3 1 0.9
What is the best way to merge them in order to get df3
as following?
df3 = data.frame(a = c(2, 2, 1, 1, 1, 2), b = c(2, 2, 2, 2, 2, 3),
c = c(2, 2, 3, 3, 3, 1), d = c(0.1, 0.2, 0.8, 0.7, 0.6, 0.9),
label = c("bbb", "bbb", "abc", "abc", "abc", "bca"))
> df3
a b c d label
1 2 2 2 0.1 bbb
2 2 2 2 0.2 bbb
3 1 2 3 0.8 abc
4 1 2 3 0.7 abc
5 1 2 3 0.6 abc
6 2 3 1 0.9 bca
I want the output to be the same size as df2
with all its attributes and values, plus an additional label
column from df1
which indicates the unique combination group of a
, b
, c
attributes.
I tried the basic merge
function as
df4 = merge(df1, df2)
and it works fine for my sample data frames (like this example here), but when I want to apply this logic to larger data sets it doesn't return the output I expect. Specially the size of out put is not the same size as df2
, but smaller size.
I was wondering if there is any alternative way to do this, or any idea about what can goes wrong applying the same logic (which seems to work fine in small data frames) in large data sets. Thanks!