
I am looking for a way to extract the positions of the "*" and "+" symbols in an R string.

test <- "x+y"
unlist(gregexpr("+", test))
[1] 1 2 3
unlist(gregexpr("y", test))
[1] 3

This returns the correct position for x or y, but for + or * it returns every position in the string.

Thank you!

POC
  • Since you are `unlist`ing your results, it seems that you only have one expression per line, so use `regexpr` instead of `gregexpr`. For the `+` symbol, you just need to escape it, as it is a metacharacter: `regexpr('\\+', test)` – Onyambu Dec 27 '21 at 20:36
  • Another way to escape it is `[+]` which is a "character class" in [regex](https://stackoverflow.com/a/22944075/3358272); this is also necessary for `[*]`. Using this, you can look for *either* (not differentiating) with `[+*]`. – r2evans Dec 27 '21 at 20:38
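
The two escaping styles from the comments above can be sketched as follows. This is only an illustration: the string `expr_str` and the variable names are made up for the example and do not come from the question.

```r
# Hypothetical input containing both symbols more than once
expr_str <- "x+y*z+w"

# Backslash escape: regexpr() reports only the first "+"
first_plus <- as.integer(regexpr("\\+", expr_str))
first_plus
# [1] 2

# Character class: gregexpr() reports every "+" or "*"
either_pos <- unlist(gregexpr("[+*]", expr_str))
either_pos
# [1] 2 4 6

# Look up which symbol sits at each matched position
either_sym <- substring(expr_str, either_pos, either_pos)
either_sym
# [1] "+" "*" "+"
```

Because `[+*]` does not distinguish between the two symbols, the `substring()` call is one way to recover which symbol was matched at each position.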

2 Answers


Use fixed = TRUE. By default it is FALSE, so the pattern is interpreted as a regular expression, in which + is a metacharacter. According to ?regex:

+ - The preceding item will be matched one or more times.

* - The preceding item will be matched zero or more times.

unlist(gregexpr("+", test, fixed = TRUE))
[1] 2
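
As a small sketch of the same idea on a longer input, the snippet below looks up each symbol separately with fixed = TRUE. The string `formula_str` and the variable names are hypothetical, chosen only for illustration.

```r
# Hypothetical input with repeated symbols
formula_str <- "a+b*c+d"

# With fixed = TRUE the pattern is taken literally, so no escaping is needed
plus_pos <- unlist(gregexpr("+", formula_str, fixed = TRUE))
star_pos <- unlist(gregexpr("*", formula_str, fixed = TRUE))
plus_pos
# [1] 2 6
star_pos
# [1] 4

# When the symbol is absent, gregexpr() returns -1 rather than an empty vector
no_match <- unlist(gregexpr("+", "xyz", fixed = TRUE))
no_match
# [1] -1
```

Note the -1 result: code that uses these positions may want to check for it before indexing.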
akrun

Some other base R workarounds that avoid regular expressions altogether:

> which(unlist(strsplit(test, "")) == "+")
[1] 2

> which(utf8ToInt(test) == utf8ToInt("+"))
[1] 2
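
Both approaches extend to several symbols at once by swapping `==` for `%in%`. The sketch below assumes a made-up string `ops_str`; the variable names are illustrative only.

```r
# Hypothetical input containing both symbols
ops_str <- "x+y*z"

# Split into single characters and test membership in the set of symbols
chars <- strsplit(ops_str, "")[[1]]
ops_pos <- which(chars %in% c("+", "*"))
ops_pos
# [1] 2 4

# Same idea using integer code points instead of characters
ops_pos_int <- which(utf8ToInt(ops_str) %in% utf8ToInt("+*"))
ops_pos_int
# [1] 2 4
```

Unlike `gregexpr`, `which()` returns an empty integer vector when nothing matches, which can be easier to handle downstream.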
ThomasIsCoding