3

I imagine this question is not unique, but I struggled to find the right search terms, so if this is a duplicate, please point me to the existing post!

I have a data frame:

test <- data.frame(x = c("a", "b", "c", "d", "e"))

  x
1 a
2 b
3 c
4 d
5 e

And I'd like to replace SOME of the values using a separate data frame:

metadata <- data.frame(
  a = c("c", "d"),
  b = c("REPLACE_1", "REPLACE_2"))

Resulting in:

  x
1 a
2 b
3 REPLACE_1
4 REPLACE_2
5 e
Sotos
MayaGans
Have a look at [Replace column values based on column in another dataframe](https://stackoverflow.com/q/59134813/10488504), [Merge dataframes of different sizes](https://stackoverflow.com/q/34438349/10488504) or [Update join](https://stackoverflow.com/a/52170570/10488504). – GKi Feb 11 '20 at 16:03

5 Answers

3

A base R solution using match + replace:

test <- within(test, x <- replace(as.character(x), match(metadata$a, x), as.character(metadata$b)))

such that

> test
          x
1         a
2         b
3 REPLACE_1
4 REPLACE_2
5         e
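
To make the one-liner above easier to follow, here is a minimal sketch that splits it into its two steps. The names `idx` and `new_x` are illustrative only, and the data is rebuilt with `stringsAsFactors = FALSE` so the `as.character()` calls are not needed.

```r
test <- data.frame(x = c("a", "b", "c", "d", "e"), stringsAsFactors = FALSE)
metadata <- data.frame(
  a = c("c", "d"),
  b = c("REPLACE_1", "REPLACE_2"), stringsAsFactors = FALSE)

# Step 1: for each key in metadata$a, find its (first) position in test$x
idx <- match(metadata$a, test$x)
idx
# positions 3 and 4

# Step 2: replace() returns a copy of test$x with those positions overwritten
new_x <- replace(test$x, idx, metadata$b)
new_x
# "a" "b" "REPLACE_1" "REPLACE_2" "e"
```

Note that this relies on every key in `metadata$a` being present in `test$x`; a key with no match gives an `NA` position, which would need to be filtered out first.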
ThomasIsCoding
2

Importing your data with stringsAsFactors = FALSE and using dplyr and stringr, you can do:

library(dplyr)
library(stringr)
test %>%
 mutate(x = str_replace_all(x, setNames(metadata$b, metadata$a)))

          x
1         a
2         b
3 REPLACE_1
4 REPLACE_2
5         e

Or using the basic idea from @Sotos:

test %>%
 mutate(x = coalesce(metadata$b[match(x, metadata$a)], x))
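
One caveat about the str_replace_all() variant: the names of the replacement vector are treated as regular-expression patterns and are replaced wherever they occur inside a string, so a value such as "cd" would also be altered. If only whole-value matches should be replaced, a named lookup vector is one possible alternative. The sketch below uses base R only; `lookup`, `hit`, and the extra value "cd" are illustrative assumptions, not part of the original data.

```r
metadata <- data.frame(
  a = c("c", "d"),
  b = c("REPLACE_1", "REPLACE_2"), stringsAsFactors = FALSE)

# Named vector: names are the old values, elements are the replacements
lookup <- setNames(metadata$b, metadata$a)

# "cd" is added here (hypothetical value) to show that partial matches are left alone
x <- c("a", "b", "c", "d", "e", "cd")

# Replace only elements that are exactly equal to one of the keys
hit <- x %in% names(lookup)
x[hit] <- unname(lookup[x[hit]])
x
# "a" "b" "REPLACE_1" "REPLACE_2" "e" "cd"
```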
tmfmnk
2

You can do:

test$x[test$x %in% metadata$a] <- na.omit(metadata$b[match(test$x, metadata$a)])

#          x
#1         a
#2         b
#3 REPLACE_1
#4 REPLACE_2
#5         e
Sotos
1

Here's one approach, though I presume there are shorter ones:

library(dplyr)
test %>%
  left_join(metadata, by = c("x" = "a")) %>%
  mutate(b = coalesce(b, x))

#  x         b
#1 a         a
#2 b         b
#3 c REPLACE_1
#4 d REPLACE_2
#5 e         e

Note: I made the data types match by creating metadata with character columns rather than factors:

metadata <- data.frame(stringsAsFactors = FALSE,
  a = c("c", "d"),
  b = c("REPLACE_1", "REPLACE_2"))
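
To illustrate why the column types matter, here is a small sketch. It forces factor columns explicitly, which was the data.frame() default before R 4.0.0; the names `test_f` and `broken` are hypothetical and used only for this demonstration.

```r
# Assumption: request factors explicitly to mimic the pre-4.0.0 default
test_f <- data.frame(x = c("a", "b", "c", "d", "e"), stringsAsFactors = TRUE)
is.factor(test_f$x)
# TRUE

# Assigning a string that is not an existing level produces NA
# (R emits an "invalid factor level" warning, suppressed here)
broken <- test_f$x
suppressWarnings(broken[3] <- "REPLACE_1")
is.na(broken[3])
# TRUE

# Converting the column to character first lets the replacement go through
test_f$x <- as.character(test_f$x)
test_f$x[3] <- "REPLACE_1"
test_f$x
# "a" "b" "REPLACE_1" "d" "e"
```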
Jon Spring
1

You can use match to make this update join.

i <- match(metadata$a, test$x)
test$x[i] <- metadata$b
test
#          x
#1         a
#2         b
#3 REPLACE_1
#4 REPLACE_2
#5         e

Or:

i <- match(test$x, metadata$a)
j <- !is.na(i)
test$x[j] <- metadata$b[i[j]]
test
#          x
#1         a
#2         b
#3 REPLACE_1
#4 REPLACE_2
#5         e

Data:

test <- data.frame(x = c("a", "b", "c", "d", "e"), stringsAsFactors = FALSE)
metadata <- data.frame(
  a = c("c", "d"),
  b = c("REPLACE_1", "REPLACE_2"), stringsAsFactors = FALSE)
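
The two variants give the same result on this data, but they differ when a value to be replaced occurs more than once, because match() returns only the first position. A minimal sketch of the difference, using a hypothetical data frame `test_dup` with a repeated "c" (the names `v1` and `v2` are also illustrative):

```r
test_dup <- data.frame(x = c("a", "c", "c", "d"), stringsAsFactors = FALSE)
metadata <- data.frame(
  a = c("c", "d"),
  b = c("REPLACE_1", "REPLACE_2"), stringsAsFactors = FALSE)

# Variant 1: look up each key in the data; only the first "c" is found
v1 <- test_dup$x
v1[match(metadata$a, v1)] <- metadata$b
v1
# "a" "REPLACE_1" "c" "REPLACE_2"

# Variant 2: look up each data value in the keys; every "c" is replaced
v2 <- test_dup$x
i <- match(v2, metadata$a)   # NA 1 1 2
j <- !is.na(i)               # rows that have a replacement
v2[j] <- metadata$b[i[j]]
v2
# "a" "REPLACE_1" "REPLACE_1" "REPLACE_2"
```

The second variant is therefore probably the safer choice when the column being updated may contain duplicates.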
GKi