I want to replace a suffix in a string. The suffix can be either .x or .y. If it is .x, I want to replace it with string1 (say); if it is .y, it should be replaced with string2. (The replacement strings are arbitrary, but there is a clear mapping between suffix and replacement string, e.g. .x -> .string1 and .y -> .string2.)

I can easily achieve that with two nested calls to gsub, like this:

in_str <- c("a.x", "a.y")
gsub("\\.y$", ".string2", gsub("\\.x$", ".string1", in_str))
# [1] "a.string1" "a.string2"

Question

Is there a regex with which I can achieve that with just one call of gsub? Or is there any library function with which I can replace the suffixes in one go?

DerStarkeBaer
thothal
  • From a "clean code" perspective it would certainly be clearer to first split the string into prefix and suffix and then build the new string with an `if`. – AEF Jul 18 '19 at 08:32
  • [This](https://stackoverflow.com/questions/33949945/replace-multiple-strings-in-one-gsub-or-chartr-statement-in-r/33950268) might be of interest. – ismirsehregal Jul 18 '19 at 08:35
  • @ismirsehregal Thanks, that is exactly what I was looking for, +1! For reference, the code would look like this: `gsubfn("\\..$", list(".x" = ".string1", ".y" = ".string2"), in_str)` – thothal Jul 18 '19 at 09:08
  • @ismirsehregal Would you mind making it an answer? Then I will accept it; otherwise I will go for Ronak's solution. – thothal Jul 18 '19 at 09:15
  • @thothal If the link provides the answer to your question, do you think it should be marked as a duplicate of that question? Or do you think these are different questions? – Ronak Shah Jul 18 '19 at 09:18
  • Yes indeed, I marked it as a duplicate, thanks for pointing that out. – thothal Jul 18 '19 at 09:22
  • For those interested: I made a benchmark regarding this [here](https://stackoverflow.com/a/57178768/9841389). A solution based on `library(stringi)` wins, which wasn't mentioned here. – ismirsehregal Jul 24 '19 at 08:48

2 Answers

I don't think that is what regexes are for; I would do it differently:

in_str <- c("a.x", "a.y", "b.y", "b.x")
strmap <- c(.x="string1", .y="string2")
strmap[ gsub(".*(\\.[xy])$", "\\1", in_str) ]

Result:

       .x        .y        .y        .x 
"string1" "string2" "string2" "string1" 

This has the advantage of being far more flexible and of cleanly separating the definition of the suffix mapping from the code that applies it. You can even automate it further:

in_str <- c("a.x", "a.y", "b.y", "b.x")
strmap <- c(x="string1", y="string2")
suffixes <- paste0(names(strmap), collapse="")
pattern <- sprintf(".*\\.([%s])$", suffixes)
res <- strmap[ gsub(pattern, "\\1", in_str) ]
names(res) <- in_str

Result:

      a.x       a.y       b.y       b.x 
"string1" "string2" "string2" "string1" 
January
  • Thanks for the answer. It is not exactly what I am after, as I would like to replace just the suffix, not the whole string, so the result should be `c("a.string1", "a.string2", "b.string2", "b.string1")` in your example. I see how I could adapt your code to get that, but I guess in this case the two `gsub` calls are even easier. – thothal Jul 18 '19 at 09:01
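
If only the suffix should change while the prefix is kept, the named-vector lookup from this answer could be combined with base R's `regexpr()` and `regmatches<-`. The following is a minimal sketch under some assumptions: the map names contain no regex metacharacters, the extra element `"c.z"` is added here only to show that unmatched strings pass through unchanged, and the variable names `pattern`, `m`, `old`, `key`, and `out` are illustrative.

```r
in_str <- c("a.x", "a.y", "b.y", "b.x", "c.z")
strmap <- c(x = "string1", y = "string2")

# Build "\\.(x|y)$" from the map names (assumed free of regex metacharacters)
pattern <- sprintf("\\.(%s)$", paste(names(strmap), collapse = "|"))

m <- regexpr(pattern, in_str)          # match position per element, -1 if none
out <- in_str
if (any(m > 0)) {
  old <- regmatches(in_str, m)         # matched suffixes, e.g. ".x"
  key <- sub("^\\.", "", old)          # drop the leading dot to get the map key
  regmatches(out, m) <- paste0(".", strmap[key])
}
out
# [1] "a.string1" "a.string2" "b.string2" "b.string1" "c.z"
```

Because the pattern uses alternation rather than a character class, this variant should also cope with suffixes longer than one character.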

You can use mgsub, which accepts multiple patterns with multiple replacements:

mgsub::mgsub(in_str, c("\\.x$", "\\.y$"), c(".string1", ".string2"))
#[1] "a.string1" "a.string2"
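
Where installing a package is not an option, the same pattern/replacement pairs can be applied in base R with `Reduce()`. This is only a sketch: it applies the pairs one after another, so it is not claimed to behave like `mgsub` in every case (for example, when one replacement would itself match a later pattern). The names `patterns`, `replacements`, and `out` are illustrative.

```r
in_str <- c("a.x", "a.y")
patterns <- c("\\.x$", "\\.y$")
replacements <- c(".string1", ".string2")

# Fold over the pattern indices, feeding each result into the next sub() call
out <- Reduce(
  function(s, i) sub(patterns[i], replacements[i], s),
  seq_along(patterns),
  in_str
)
out
# [1] "a.string1" "a.string2"
```

Since each pattern is anchored with `$`, `sub()` is sufficient here; at most one match per string is possible.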
Ronak Shah