I have two dataframes in R that contain 3 columns:

df1 <- data.frame("Gene"=c("Myc", "Rad", "Meg", "Cdc"), "Meth"=c(13, 62, 62, 79), "Exp"=c(-4.2, 1, 0.9, -2))
df2 <- data.frame("Gene"=c("Rad", "Gnas", "Meg", "Klm"), "Meth"=c(54, 13, 05, 84), "Exp"=c(-3.2, 0, 3.9, -2))

I would like to make two versions (or at least one of the two) of a new dataframe. 1) Contains all of df1, plus two new columns holding the df2 values for the genes that also appear in df2's Gene column (NA otherwise), such that:

df3 <- data.frame("Gene"=c("Myc", "Rad", "Meg", "Cdc"), "Meth"=c(13, 62, 62, 79), "Exp"=c(-4.2, 1, 0.9, -2), "Meth2"=c(NA, 54, 05, NA), "Exp2"=c(NA, -3.2, 3.9, NA))

2) Contains only the values for Genes that are in both df1 and df2:

df3 <- data.frame("Gene"=c("Rad", "Meg"), "Meth"=c(62, 62), "Exp"=c(1, 0.9), "Meth2"=c(54, 05), "Exp2"=c(-3.2, 3.9))

2 Answers

1

You can use merge (also have a look here):

> merge(df1, df2, by="Gene", all.x=TRUE)
  Gene Meth.x Exp.x Meth.y Exp.y
1  Cdc     79  -2.0     NA    NA
2  Meg     62   0.9      5   3.9
3  Myc     13  -4.2     NA    NA
4  Rad     62   1.0     54  -3.2

> merge(df1,df2, by = "Gene")
  Gene Meth.x Exp.x Meth.y Exp.y
1  Meg     62   0.9      5   3.9
2  Rad     62   1.0     54  -3.2
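
The output above uses merge's default .x/.y suffixes, while the question asks for columns named Meth2 and Exp2. A minimal sketch of one way to get those names, assuming the suffixes argument of base R's merge is acceptable here (the names left_join_df and inner_join_df are only illustrative):

```r
df1 <- data.frame(Gene = c("Myc", "Rad", "Meg", "Cdc"),
                  Meth = c(13, 62, 62, 79),
                  Exp  = c(-4.2, 1, 0.9, -2))
df2 <- data.frame(Gene = c("Rad", "Gnas", "Meg", "Klm"),
                  Meth = c(54, 13, 5, 84),
                  Exp  = c(-3.2, 0, 3.9, -2))

# suffixes = c("", "2") keeps df1's column names as they are and
# appends "2" to the clashing column names coming from df2.
left_join_df  <- merge(df1, df2, by = "Gene", all.x = TRUE, suffixes = c("", "2"))
inner_join_df <- merge(df1, df2, by = "Gene", suffixes = c("", "2"))

names(left_join_df)
```

Note that merge sorts the result by the key column by default, so the rows come back in alphabetical Gene order rather than in df1's original order.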
1

?merge can do this.

df3 <- merge(df1,df2, by = "Gene", all.x = TRUE)

df4 <- merge(df1,df2, by = "Gene")
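
If the row order of df1 matters (merge sorts by Gene), a lookup with match is one possible alternative. This is only a sketch, and it assumes each gene appears at most once in df2; the helper variable idx is hypothetical:

```r
df1 <- data.frame(Gene = c("Myc", "Rad", "Meg", "Cdc"),
                  Meth = c(13, 62, 62, 79),
                  Exp  = c(-4.2, 1, 0.9, -2))
df2 <- data.frame(Gene = c("Rad", "Gnas", "Meg", "Klm"),
                  Meth = c(54, 13, 5, 84),
                  Exp  = c(-3.2, 0, 3.9, -2))

# For each gene in df1, find its row in df2 (NA if it is not there).
idx <- match(df1$Gene, df2$Gene)

# Version 1: all of df1, with df2's values where available.
df3 <- cbind(df1, Meth2 = df2$Meth[idx], Exp2 = df2$Exp[idx])

# Version 2: only the genes found in both data frames.
df4 <- df3[!is.na(idx), ]
```

Indexing with an NA position returns NA, which is what fills Meth2 and Exp2 for genes missing from df2. The rows of df4 keep their original row names from df3; rownames(df4) <- NULL would reset them if that is wanted.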