0

Suppose I have three different data sets (practically I have more) with different variables but one which is ID, I would like to merge them by IDs which are common to all data and keep all the columns.

df1=data.frame(ID=c(1:10), x1=rnorm(10), y1=rnorm(10), z1=rnorm(10))
df2=data.frame(ID=c(2:11),x2=rpois(10,3), y1=rbinom(10,5,0.64),z2=rbinom(10,1,0.80) )
df3=data.frame(ID=c(5:14),x3=rbinom(10,1,0.80), y3=rnorm(10),x3=rpois(10,4) )
same_id=Reduce(intersect,list(df1$ID, df2$ID, df3$ID))
> same_id
[1]  5  6  7  8  9 10

I would like to create a data set that will have only same_id and their corresponding values for all variables.

Any help is appreciate

Uddin
  • 754
  • 5
  • 18

1 Answers1

1

Using dplyr:

> library(dplyr)
> df1 %>% inner_join(df2, by = 'ID') %>% inner_join(df3, by = 'ID')
  ID         x1        y1.x         z1 x2 y1.y z2 x3          y3 x3.1
1  5  2.4167734  0.03455107 -0.6470456  2    1  1  1 -0.09294102    9
2  6 -1.6510489  0.19023032 -2.1576463  1    3  1  0  0.43028470    5
3  7 -0.4639872  0.17472640  0.8842508  2    3  1  1  0.53539884    4
4  8  0.8253799 -1.05501704 -0.8294776  2    3  1  1 -0.55527835    4
5  9  0.5101325  0.47613328 -0.5735603  2    3  1  1  1.77950291    7
6 10 -0.5894810  1.37857014  1.5039006  3    5  1  1  0.28642442    2
> 
Karthik S
  • 11,348
  • 2
  • 11
  • 25
  • what if I have 20 or 30 data sets? Do I have to go in the same fashion? – Uddin Oct 20 '20 at 17:22
  • 1
    You can do something like this using purrr package's "reduce" function: list(df1, df2, df3) %>% reduce(inner_join, by = "ID") – Karthik S Oct 20 '20 at 17:28