I am trying to replace several substrings within a string using gsub. In general, gsub(pattern, replacement, x) accepts a character vector x, but pattern and replacement must each be a single string (pattern may be a regular expression); neither can be a vector of several strings.

I can implement this crudely with a loop, but in my case both pattern and replacement are long vectors, so I am hoping for a "vectorized" implementation if possible.

My current implementation is as follows:

    strs <- "apples.in.bed"
    replace.vect <- c("a", "e", "i")
    new.char.vect <- c("1", "2", "3")
    temp <- strs
    # seq_along() is safe even when replace.vect is empty, unlike 1:length()
    for (i in seq_along(replace.vect)) {
      temp <- gsub(replace.vect[i], new.char.vect[i], temp)
    }

    # temp: "1ppl2s.3n.b2d"

However, I'd like to arrive at the same result without the for loop. I have also tried apply, but internally it appears to loop over all the elements as well, so it does not seem to offer much of a performance gain:

    apply(cbind(replace.vect, new.char.vect),1,function(x) {strs<<-gsub(x[1],x[2],strs)})

I have also considered the chartr function, as shown here, but it only replaces single characters with single characters; it cannot replace strings with strings.

Any suggestions highly appreciated!

JSB

1 Answer

You could try the sedit function from the Hmisc package:

    library(Hmisc)
    sedit(text = strs, replace.vect, new.char.vect)
    # [1] "1ppl2s.3n.b2d"
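
If installing a package is not an option, the same sequential replacement can be written in base R by folding gsub over the pattern/replacement pairs with Reduce. This is only a sketch: `multi_gsub` is a hypothetical helper name (not part of base R or Hmisc), and `fixed = TRUE` is an assumption that the patterns are literal strings rather than regular expressions. It still calls gsub once per pattern, so it tidies the code rather than removing the loop.

```r
# Hypothetical helper: apply each pattern/replacement pair in order.
# fixed = TRUE treats patterns as literal text (an assumption; set it
# to FALSE if the patterns are meant to be regular expressions).
multi_gsub <- function(x, patterns, replacements, fixed = TRUE) {
  stopifnot(length(patterns) == length(replacements))
  Reduce(
    function(acc, i) gsub(patterns[i], replacements[i], acc, fixed = fixed),
    seq_along(patterns),
    x  # initial value: the untouched input vector
  )
}

strs <- "apples.in.bed"
replace.vect <- c("a", "e", "i")
new.char.vect <- c("1", "2", "3")

result <- multi_gsub(strs, replace.vect, new.char.vect)
print(result)
# [1] "1ppl2s.3n.b2d"
```

Because the pairs are applied one after another, a later pattern can match text produced by an earlier replacement, exactly as in the original for loop. If the stringi package is available, `stri_replace_all_fixed(strs, replace.vect, new.char.vect, vectorize_all = FALSE)` performs the same sequential replacement in compiled code and may be worth benchmarking on long vectors.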
JoeArtisan
  • Have changed it now to use a different function. Hopefully this will be what you're looking for. – JoeArtisan Mar 24 '15 at 10:22
  • This will be no better than `chartr`. Try this on `replace.vect<-c("(a)","e","i")`. This *won't* return `(a)ppl2s.(a)r2.b(a)d`. Not to mention that `sedit` is exactly the same `for` loop as in the OP's question, so it definitely won't improve performance, which is the main goal here – David Arenburg Mar 24 '15 at 11:25
  • @DavidArenburg I take your point about `sedit` being no better than a for loop, though can you expand on your first point? From the output you said you wanted returned, you would want `new.char.vect<-c("(a)","2","3")` which works for me. @JSB You could always try parallelising your code, e.g. `library(parallel)` `newText <- mclapply(strs, sedit, replace.vect, new.char.vect, mc.cores = 4) %>% unlist` – JoeArtisan Mar 24 '15 at 12:49
  • OP wants the result to be `(a)ppl2s.(a)r2.b(a)d`. You can't see deleted posts but I've already proposed that solution using `chartr` and OP posted this comment. – David Arenburg Mar 24 '15 at 12:52