-1

I have a dataframe of unique keys and values called df1:

df1
Key       Value
Art_1     1342
Art_2     432 
Art_3     42342 
Art_4     3244 
Art_5     6757 
Art_6     432 
....
Art_7262  6757

I want to use that table to assign values to df2. It has 1 million entries.

df2
Key       Value
Art_1     1342
Art_1     NA 
Art_1     NA 
Art_2     NA 
Art_2     6757 
Art_2     NA 
...
Art_600  NA
Art_600  NA
Art_600  6757
Art_600  6757
...
Art_7262  6757
Art_7262  NA

Currently I'm doing the following:

for (i in 1:nrow(df1)){
  df2[(df2$key==df1$key[i]),]$value <- df1$value[i]
}

It's extremely slow. The final output should be:

df2
Key       Value
Art_1     1342
Art_1     1342 
Art_1     1342 
Art_2     6757 
Art_2     6757 
Art_2     6757 
...
Art_600  6757
Art_600  6757
Art_600  6757
Art_600  6757
...
Art_7262  6757
Art_7262  6757
dank
  • 303
  • 4
  • 20
  • 1
    This is a just a `merge` operation - take a look at the examples here - https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right – thelatemail Jan 30 '18 at 22:23
  • My brain completely forgot about merging. crazy – dank Jan 30 '18 at 22:26
  • I was just typing a solution when this got closed as a dupe. Another possibility is `library(zoo); na.locf(df2);` which requires `df2` to have at least one entry for every key (and that the entries are sorted). See `?zoo::na.locf` for details. – Maurits Evers Jan 30 '18 at 22:38

3 Answers3

1

You can use df1 as a look-up table

row.names(df1) = df1$Key
df2$Value = df1$Value[df2$Key]
G5W
  • 36,531
  • 10
  • 47
  • 80
1

Using the match function in the correct order will make things nice and quick for you :)

keymatches <- match(df2$Key, df1$Key)
df2$Value <- df1$Value[keyMatches]

match just returns the position in the vector that it finds the match. So in this case it would return c(1, 1, 1, 2, 2, 2) etc. We then use that as the index for what we want to pull out in df1.

LachlanO
  • 1,152
  • 8
  • 14
1

Wtih tidyverse

library(tidyverse)
df2 <- left_join(df2$key, df1, by = "Key")
Nicolás Velasquez
  • 5,623
  • 11
  • 22