0

I have two dataframes, one with measurements on individuals:

> head(df1)
          ID       y1       y2
1      0493B 191.3578 17.26139
2      7393B 241.2554 30.98948
3 Ames-27398 190.8905 52.95530
4 Ames-27399 148.9887 51.09268
5 Ames-27404 162.1147 43.67760
6 Ames-27414 171.2277 42.51208

and a second which is a correspondence key between "ID" and "ID2"

> head(df2)
     ID2              ID
1 142462 Vilmorin-I04428
2 142464       Ames-7701
3 142466       Ames-7702
4 142468       Ames-7705
5 142470       Ames-7706
6 142472       Ames-7711

I want to rename the values of "ID" in df1 with their corresponding "ID2" values in df2 in the most straightforward way. My current solution:

order <- match(df1$ID, df2$ID)
key.ordered <- key[order, ]
df1$ID <- df2$ID2

seems extremely cumbersome.

shbrainard
  • 377
  • 2
  • 9
  • 1
    What is `key` ? How is it related here? Also can you give an example where `ID`'s match and show expected output? In this example none of the ID match and hence `match` returns `NA` for all. – Ronak Shah Mar 04 '20 at 02:16
  • 2
    Does this answer your question? [How to join (merge) data frames (inner, outer, left, right)](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – cardinal40 Mar 04 '20 at 02:17
  • @RonakShah Yea, these are just the headers of very long files. Basically I want to replace text from df1 with the corresponding number from df2. – shbrainard Mar 04 '20 at 03:23
  • Does the post from cardinal40 answers your question? I'll mark it as duplicate of that post then? – Ronak Shah Mar 04 '20 at 04:09
  • Yea, it's not quite the same question, as left.join produces somewhat different output, but it's definitely a solution! – shbrainard Mar 04 '20 at 19:40

2 Answers2

1

What I usually do for similar situations is to use the left_join() and the dplyr package.

library(dplyr)

df1 <-
  df1 %>%
  left_join(df2, by = "ID") %>%
  mutate(ID = ifelse(is.na(ID2), ID, ID2) %>%
  select(-c(ID2))
user2332849
  • 1,421
  • 1
  • 9
  • 12
1

You were close. Use match to select ID2 values of df2 to replace IDs of df1.

df1$ID.new <- df2[match(df1$ID, df2$ID), "ID2"]
#          ID          y1          y2 ID.new
# 1 gjmj-6635 -1.51010077 -0.54070894 142462
# 2 gjmj-2213  0.15032996 -0.56827851 142464
# 3 gjmj-4686  0.08240943  0.31695807 142465
# 4 qlfu-9776 -0.88209188 -0.04709116 142466
# 5 qlfu-2242  1.17386673 -0.72486079 142467
# 6 qlfu-9290 -0.54647434  1.26443570 142468

Data:

df1 <- structure(list(ID = structure(c(3L, 1L, 2L, 6L, 4L, 5L), .Label = c("gjmj-2213", 
"gjmj-4686", "gjmj-6635", "qlfu-2242", "qlfu-9290", "qlfu-9776"
), class = "factor"), y1 = c(0.405671147370875, -0.664875758352609, 
-0.292573014773075, 1.52846866859723, 0.75230586140235, -0.188505367141408
), y2 = c(0.0656388238223931, -0.764525211440881, -0.997090313648482, 
0.654033292501034, 0.832747966698076, -0.0542024033903575)), class = "data.frame", row.names = c(NA, 
-6L))

df2 <- structure(list(ID = structure(c(3L, 1L, 2L, 6L, 4L, 5L), .Label = c("gjmj-2213", 
"gjmj-4686", "gjmj-6635", "qlfu-2242", "qlfu-9290", "qlfu-9776"
), class = "factor"), ID2 = c(142462, 142464, 142465, 142466, 
142467, 142468)), class = "data.frame", row.names = c(NA, -6L
))
jay.sf
  • 60,139
  • 8
  • 53
  • 110