Match and replace words in char-vector

Question

I have a vector with text lines in it, like this:

text<-c("Seat 1: 7e7389e3 ($2 in chips)","Seat 3: 6786517b ($1.67 in chips)","Seat 4: 878b0b52 ($2.16 in chips)","Seat 5: a822375 ($2.37 in chips)","Seat 6: 7a6252e6 ($2.51 in chips)")

And I have to replace some words with other words that I have in a data frame like this:

df<-data.frame(codigo=c("7e7389e3","6786517b","878b0b52","a822375","7a6252e6"),
name=c("lucas","alan","ivan","lucio","donald"))

So I would like to 1) grab the first line of "text", 2) check whether it contains any word to replace from df, 3) replace it, and 4) do the same with the next "text" line, and so on, in order to get something like this:

[1] "Seat 1: lucas ($2 in chips)"
[2] "Seat 3: alan ($1.67 in chips)"
[3] "Seat 4: ivan ($2.16 in chips)"
[4] "Seat 5: lucio ($2.37 in chips)"
[5] "Seat 6: donald ($2.51 in chips)"

Is there any function to do this?

score 3 · Answer 1 · answered Oct 29 '20 at 22:09

We can do this easily with str_replace_all, which can take a named vector (names are the patterns, values are the replacements):

library(stringr)
library(tibble)
str_replace_all(text, deframe(df))
#[1] "Seat 1: lucas ($2 in chips)"
#[2] "Seat 3: alan ($1.67 in chips)"
#[3] "Seat 4: ivan ($2.16 in chips)"
#[4] "Seat 5: lucio ($2.37 in chips)"
#[5] "Seat 6: donald ($2.51 in chips)"
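
If you would rather not load tibble just for deframe, the same named vector can probably be built with base R's setNames. This is a minimal sketch on a shortened copy of the question's data; `lookup` and `res` are names introduced here for illustration. Keep in mind that str_replace_all treats the names as regular expressions, which is harmless for purely alphanumeric codes like these.

```r
library(stringr)

# Shortened copy of the data from the question
text <- c("Seat 1: 7e7389e3 ($2 in chips)",
          "Seat 3: 6786517b ($1.67 in chips)")
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

# Names are the patterns to find, values are the replacements
lookup <- setNames(df$name, df$codigo)

res <- str_replace_all(text, lookup)
res
```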

score 3 · Answer 2 · answered Oct 29 '20 at 23:24

A base R option using sapply + gsub + Vectorize

unname(sapply(text,function(x) (u <- Vectorize(gsub)(df$codigo,df$name,x,fixed = TRUE))[u!=x]))

which gives

[1] "Seat 1: lucas ($2 in chips)"     "Seat 3: alan ($1.67 in chips)"
[3] "Seat 4: ivan ($2.16 in chips)"   "Seat 5: lucio ($2.37 in chips)"
[5] "Seat 6: donald ($2.51 in chips)"
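
To see what the one-liner does, it may help to look at the intermediate vector for a single line. The sketch below uses a shortened copy of the question's data; `x`, `x_missing`, `u` and `u_missing` are illustrative names. It also shows a caveat: the `[u != x]` filter only yields exactly one string when exactly one code matches the line.

```r
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

x <- "Seat 3: 6786517b ($1.67 in chips)"

# One gsub call per row of df: only the second one changes the line
u <- Vectorize(gsub)(df$codigo, df$name, x, fixed = TRUE)
unname(u)

# Keeping the elements that differ from the input selects the replaced line
unname(u[u != x])

# A line whose code is not in df: nothing differs, so the filter is empty
x_missing <- "Seat 9: ffffffff ($1 in chips)"
u_missing <- Vectorize(gsub)(df$codigo, df$name, x_missing, fixed = TRUE)
length(u_missing[u_missing != x_missing])
```

So if some lines might contain no known code (or more than one), sapply would return a list rather than a character vector, and one of the sequential approaches is likely the safer choice.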

score 2 · Answer 3 · answered Oct 30 '20 at 00:14

Cases like this are a perfect occasion to use a for loop. It's boring, but it works, and it is reasonably competitive in efficiency terms, as discussed in this previous question: "regex for preserving case pattern, capitalization".

out <- text
for (i in seq_len(nrow(df)) ) {
    out <- gsub(df$codigo[i], df$name[i], out)
}
out
#[1] "Seat 1: lucas ($2 in chips)"     "Seat 3: alan ($1.67 in chips)"  
#[3] "Seat 4: ivan ($2.16 in chips)"   "Seat 5: lucio ($2.37 in chips)" 
#[5] "Seat 6: donald ($2.51 in chips)"
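
The same sequential replacement can also be written with Reduce, if you prefer a functional style. This is only a sketch on a shortened copy of the question's data; `replace_one` is a helper name made up for this example. It also passes fixed = TRUE so that the codes are matched literally instead of as regular expressions, which should not matter for these codes but is safer if a code ever contains a character such as "." or "(".

```r
text <- c("Seat 1: 7e7389e3 ($2 in chips)",
          "Seat 3: 6786517b ($1.67 in chips)")
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

# Apply the i-th replacement to the whole vector accumulated so far
replace_one <- function(acc, i) {
  gsub(df$codigo[i], df$name[i], acc, fixed = TRUE)
}

# Start from text and fold over the row indices of df
out <- Reduce(replace_one, seq_len(nrow(df)), text)
out
```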

score 1 · Answer 4 · answered Oct 29 '20 at 19:24

Try this approach using tidyverse functions. Since every line follows a pattern with : and (, you can replace both delimiters with a common split character, then separate the string into columns, join with df, and finally concatenate the pieces to get the expected result. Here is the code:

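library(tidyverse)
res <- text %>% as.data.frame %>% setNames(.,'v1') %>%
  # mark both delimiters with '*' so each line can be split into three columns
  mutate(v1=gsub(': ','*',v1),
         v1=gsub(' (','*',v1,fixed=TRUE)) %>%
  separate(v1,c('Var1','codigo','Var3'),sep='\\*') %>%
  # look up the name that belongs to each code
  left_join(df,by='codigo') %>%
  # put the delimiters back around the name
  mutate(Out=paste0(Var1,': ',name,' (',Var3)) %>%
  select(Out)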

Output:

                              Out
1     Seat 1: lucas ($2 in chips)
2   Seat 3: alan ($1.67 in chips)
3   Seat 4: ivan ($2.16 in chips)
4  Seat 5: lucio ($2.37 in chips)
5 Seat 6: donald ($2.51 in chips)
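
The same "extract the code, look it up, put the name back" idea can be sketched in base R with regexpr and regmatches. This assumes that every line contains a code right after ": " and that every such code is present in df (an unmatched code would give an NA replacement, which regmatches<- rejects). The names `m`, `codes` and `out` are illustrative.

```r
text <- c("Seat 1: 7e7389e3 ($2 in chips)",
          "Seat 3: 6786517b ($1.67 in chips)")
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

# Locate the first run of non-space characters that follows ": "
m <- regexpr("(?<=: )\\S+", text, perl = TRUE)
codes <- regmatches(text, m)

# Swap each located code for the name found in df
out <- text
regmatches(out, m) <- df$name[match(codes, df$codigo)]
out
```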
