2

I have a vector with text lines in it, like this:

text<-c("Seat 1: 7e7389e3 ($2 in chips)","Seat 3: 6786517b ($1.67 in chips)","Seat 4: 878b0b52 ($2.16 in chips)","Seat 5: a822375 ($2.37 in chips)","Seat 6: 7a6252e6 ($2.51 in chips)")

And I have to replace some words with other words, which I have in a data frame like this:

df<-data.frame(codigo=c("7e7389e3","6786517b","878b0b52","a822375","7a6252e6"),
name=c("lucas","alan","ivan","lucio","donald"))

So I would like to: 1) take the first line of "text", 2) check whether it contains any word listed in df, 3) replace it, and 4) do the same with the next "text" line, and so on, in order to get something like this:

[1] "Seat 1: lucas ($2 in chips)"
[2] "Seat 3: alan ($1.67 in chips)"
[3] "Seat 4: ivan ($2.16 in chips)"
[4] "Seat 5: lucio ($2.37 in chips)"
[5] "Seat 6: donald ($2.51 in chips)"

Is there any function to do this?

Ivan Cereghetti

4 Answers

3

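We can do this easily with str_replace_all, which can take a named vector of replacements (names are the patterns, values are the replacements). deframe converts the two-column df into such a vector.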

library(stringr)
library(tibble)
str_replace_all(text, deframe(df))
#[1] "Seat 1: lucas ($2 in chips)"
#[2] "Seat 3: alan ($1.67 in chips)"
#[3] "Seat 4: ivan ($2.16 in chips)"
#[4] "Seat 5: lucio ($2.37 in chips)"
#[5] "Seat 6: donald ($2.51 in chips)"
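
For reference, deframe() turns a two-column data frame into a named vector, taking names from the first column and values from the second. Below is a minimal base R sketch of that same lookup vector, assuming `codigo` holds the patterns and `name` the replacements. The variable name `lookup` and the shortened two-line data are only illustrative.

```r
text <- c("Seat 1: 7e7389e3 ($2 in chips)",
          "Seat 3: 6786517b ($1.67 in chips)")
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name   = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

# Names are the patterns to search for, values are the replacements
lookup <- setNames(df$name, df$codigo)

# stringr is optional here; the call is skipped if it is not installed
if (requireNamespace("stringr", quietly = TRUE)) {
  result <- stringr::str_replace_all(text, lookup)
  print(result)
}
```

Note that str_replace_all treats the names as regular expressions by default, so codes containing regex metacharacters would need escaping first.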
akrun
3

A base R option using sapply + gsub + Vectorize

unname(sapply(text,function(x) (u <- Vectorize(gsub)(df$codigo,df$name,x,fixed = TRUE))[u!=x]))

which gives

[1] "Seat 1: lucas ($2 in chips)"     "Seat 3: alan ($1.67 in chips)"
[3] "Seat 4: ivan ($2.16 in chips)"   "Seat 5: lucio ($2.37 in chips)"
[5] "Seat 6: donald ($2.51 in chips)"
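
One caveat: the `[u != x]` filter keeps only the replacement that actually changed the line, so a line containing no known code yields an empty result and sapply then returns a list instead of a character vector. As a hedged alternative sketch (not this answer's method), the replacements can be folded over the whole vector with Reduce, so unmatched lines pass through unchanged. The "Seat 9" line is a hypothetical addition for illustration.

```r
text <- c("Seat 1: 7e7389e3 ($2 in chips)",
          "Seat 3: 6786517b ($1.67 in chips)",
          "Seat 9: ffffffff ($5 in chips)")  # hypothetical line with an unknown code
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name   = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

# Start from `text` and apply one literal (fixed = TRUE) replacement per row of df
out <- Reduce(
  function(acc, i) gsub(df$codigo[i], df$name[i], acc, fixed = TRUE),
  seq_len(nrow(df)),
  text
)
out
```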
ThomasIsCoding
2

Cases like this are a perfect occasion to use a for loop. It's boring, but it works, and it is reasonably competitive in efficiency terms, as discussed in this previous question: "regex for preserving case pattern, capitalization".

out <- text
# Apply each codigo -> name replacement in turn to the whole vector
for (i in seq_len(nrow(df))) {
    out <- gsub(df$codigo[i], df$name[i], out)
}
out
#[1] "Seat 1: lucas ($2 in chips)"     "Seat 3: alan ($1.67 in chips)"  
#[3] "Seat 4: ivan ($2.16 in chips)"   "Seat 5: lucio ($2.37 in chips)" 
#[5] "Seat 6: donald ($2.51 in chips)"
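
Because gsub treats its pattern as a regular expression by default, a code containing a metacharacter such as "." could match more than intended. The sketch below wraps the same loop in a function and compares regex matching with literal matching via `fixed = TRUE`. The function name `replace_codes` and the sample codes "a.c" and "abc" are hypothetical, chosen only to show the difference.

```r
text <- c("Seat 1: a.c ($2 in chips)",
          "Seat 2: abc ($3 in chips)")
df <- data.frame(codigo = "a.c", name = "lucas", stringsAsFactors = FALSE)

# Same loop as above, with the matching mode exposed as an argument
replace_codes <- function(x, codes, replacements, fixed = TRUE) {
  for (i in seq_along(codes)) {
    x <- gsub(codes[i], replacements[i], x, fixed = fixed)
  }
  x
}

# As a regex, "a.c" also matches "abc", so both lines are changed
as_regex <- replace_codes(text, df$codigo, df$name, fixed = FALSE)

# As a literal string, only the exact code "a.c" is replaced
literal <- replace_codes(text, df$codigo, df$name, fixed = TRUE)
```

With the hexadecimal codes from the question the two modes give the same result, so this only matters if the codes can contain regex metacharacters.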
thelatemail
1

Try this approach using tidyverse functions. Since every line follows the same pattern, with the code sitting between ": " and " (", you can replace both delimiters with a common split character, separate the string into columns, join with df, and finally concatenate the pieces to get the expected result. Here is the code:

library(tidyverse)
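res <- text %>% as.data.frame %>% setNames(.,'v1') %>%
  # Mark both delimiters with '*' so the string can be split into three parts
  mutate(v1=gsub(': ','*',v1),
         v1=gsub(' (','*',v1,fixed=TRUE)) %>%
  separate(v1,c('Var1','codigo','Var3'),sep='\\*') %>%
  # Look up each code in df, then rebuild the line with the name in its place
  left_join(df,by='codigo') %>%
  mutate(Out=paste0(Var1,': ',name,' (',Var3)) %>%
  select(Out)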

Output:

                              Out
1     Seat 1: lucas ($2 in chips)
2   Seat 3: alan ($1.67 in chips)
3   Seat 4: ivan ($2.16 in chips)
4  Seat 5: lucio ($2.37 in chips)
5 Seat 6: donald ($2.51 in chips)
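
The same "extract the code, look it up, put the name back" idea can also be sketched in base R with regmatches, assuming every line has the form "Seat N: <code> (...)". This is an illustrative alternative rather than the method above. The "Seat 9" line is hypothetical and shows that unknown codes are left as they are.

```r
text <- c("Seat 1: 7e7389e3 ($2 in chips)",
          "Seat 3: 6786517b ($1.67 in chips)",
          "Seat 9: ffffffff ($5 in chips)")  # hypothetical line with an unknown code
df <- data.frame(codigo = c("7e7389e3", "6786517b"),
                 name   = c("lucas", "alan"),
                 stringsAsFactors = FALSE)

out <- text

# Locate the token that sits between ": " and " (" in each line
m <- regexpr("(?<=: )[^ ]+(?= \\()", out, perl = TRUE)
found <- regmatches(out, m)

# Look each extracted code up in df; keep the original token if it is unknown
idx <- match(found, df$codigo)
replacement <- ifelse(is.na(idx), found, df$name[idx])

# Write the replacements back into the matched positions
regmatches(out, m) <- replacement
out
```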
Duck