0

I need to merge multiple dataframe with the matching values in column A. What is the most efficient way to do this and get the result.

df1

A  B  C
2  x  r
1  c  r
3  y  t

df2

A  D  E
3  e  y
1  t  t 
2  y  t

df3

A  F  G
1  g  y
2  f  y
3  h  k

result

A  B  C  D  E  F  G  
1  c  r  t  t  g  y
2  x  r  y  t  f  y
3  y  t  y  t  h  k
MAPK
  • 5,635
  • 4
  • 37
  • 88
  • Is that really just a `cbind` or are you looking for a join of common values where not all values of `A` exist in all data frames? – Gopala Apr 20 '16 at 01:13
  • @Gopala No I do not want to do cbind. I am not sure if all the dfs have same number of rows. – MAPK Apr 20 '16 at 01:14
  • 1
    Interesting. Yeah, it is a very good solution if you truly have a variable length list of data frames. Thank you for posting. – Gopala Apr 20 '16 at 01:22

1 Answers1

3

One solution is to use dplyr package and it's inner_join as follows:

library(dplyr)
df <- inner_join(df1, df2)
df <- inner_join(df, df3)

Resulting output is:

df
  A B C D E F G
1 2 x r y t f y
2 1 c r t t g y
3 3 y t e y h k

Note, inner_join keeps only rows where A matches.

If you want it arranged by column A, you can add this line:

arrange(df, A)
  A B C D E F G
1 1 c r t t g y
2 2 x r y t f y
3 3 y t e y h k

To merge a variable length list of data frames, it appears Reduce can be helpful along with the above inner_join:

df <- Reduce(inner_join, list(df1, df2, df3))
arrange(df, A)
  A B C D E F G
1 1 c r t t g y
2 2 x r y t f y
3 3 y t e y h k
Gopala
  • 10,363
  • 7
  • 45
  • 77
  • Thanks , but merging the list of dataframe would be much better. – MAPK Apr 20 '16 at 01:18
  • You can use base package's `merge` by nesting the merges as I am basically nesting the merges using `inner_join`. It will be slower. Unaware of any other solution, but will see if others post something different. – Gopala Apr 20 '16 at 01:21
  • Added the `Reduce` option with `inner_join`. Not sure if that fits what you want. – Gopala Apr 20 '16 at 01:26