
I have a text string with the following pattern.

x = "sdfwervd \calculus{fff}{\trt{sdfsdf} & \trt{sdfsdf} & \trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf"
  1. I want to use regex to capture the text "fff" in the string \calculus{fff} and replace it with something else.

  2. Further, I want to capture the string between the first { after \calculus{.+} and its corresponding closing curly brace }.

How can I do this with regex in R?

The following attempt matches everything up to the last curly brace.

gsub("(\\calculus\\{)(.+)(\\})", "", x)
Crops
  • First of all, the backslashes in your `x` string literal must be doubled (I am not sure about `\\{}` though: `"\\{}"` => `\{}`). Then, the backslash before `c` is matched with 2 literal backslashes in the regex, that is, 4 backslashes in the string literal. To match as few chars as possible, replace `.+` with `.+?`. `gsub` replaces each match with the replacement; with `""` you are removing the whole match, and only the match. – Wiktor Stribiżew Apr 03 '18 at 09:53
  • Can try something like gsub("(\\calculus\\{)(.+)(\\})", "(\\calculus\\{)(###)(\\})", x) – Guillaume Ottavianoni Apr 03 '18 at 09:55
  • See http://rextester.com/MQHT68999 – Wiktor Stribiżew Apr 03 '18 at 10:03
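
The points from the first comment (doubled backslashes, four backslashes to match one, and a non-greedy match) can be sketched for the first task as follows. This is only an illustration: the replacement text `NEW` and the variable names are placeholders, not something from the question.

```r
# The question's string, written as a valid R literal (backslashes doubled).
x <- "sdfwervd \\calculus{fff}{\\trt{sdfsdf} & \\trt{sdfsdf} & \\trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf"

# "\\\\" in the R literal is the regex \\ , which matches one literal backslash.
# Greedy .+ runs to the LAST closing brace, so far too much is replaced.
greedy <- sub("(\\\\calculus\\{).+(\\})", "\\1NEW\\2", x)

# Lazy .+? stops at the first closing brace, so only "fff" is replaced.
lazy <- sub("(\\\\calculus\\{).+?(\\})", "\\1NEW\\2", x, perl = TRUE)

# A negated character class gives the same result without a lazy quantifier.
negated <- sub("(\\\\calculus\\{)[^{}]*(\\})", "\\1NEW\\2", x)

cat(lazy, "\n")
```

The two capture groups keep `\calculus{` and `}` in place, so only the text between them changes.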

1 Answer


For the second task you can use a recursive approach in combination with regmatches() and gregexpr() in base R:

x <- c("sdfwervd \\calculus{fff}{\\trt{sdfsdf} & \\trt{sdfsdf} & \\trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf")

pattern <- "\\{(?:[^{}]*|(?R))*\\}"
(result <- regmatches(x, gregexpr(pattern, x, perl = TRUE)))


This yields a list of the found submatches:
[[1]]
[1] "{fff}"                                                                          
[2] "{\\trt{sdfsdf} & \\trt{sdfsdf} & \\trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 }"

See a demo for the expression on regex101.com.
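
If only the group that directly follows `\calculus{...}` is wanted, rather than every balanced group in the string, one possible variation is to anchor the recursive pattern on `\calculus`. The sketch below assumes PCRE's `\K` (which drops everything matched so far from the reported match) and a numbered recursion `(?1)` in place of `(?R)`; the names `anchored` and `second` are illustrative.

```r
x <- "sdfwervd \\calculus{fff}{\\trt{sdfsdf} & \\trt{sdfsdf} & \\trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf"

# \\calculus{...} must come first; \K then resets the start of the match.
# Group 1 is a balanced {...} block; (?1) recurses into it for nested braces.
anchored <- "\\\\calculus\\{[^{}]*\\}\\K(\\{(?:[^{}]+|(?1))*\\})"

# regexpr() returns the first match only, which is all that is needed here.
second <- regmatches(x, regexpr(anchored, x, perl = TRUE))
print(second)
```

Using `[^{}]+` instead of `[^{}]*` inside the repeated group avoids an alternative that can match the empty string, which keeps backtracking down on long inputs.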

Jan
  • How can I remove the curly braces from the second match only and modify it, for example `\\calculus{fff}{anystring}` to `\\calculus{fff}##anystring$$`? https://regex101.com/r/vduvHi/3 – Crops Apr 03 '18 at 13:04
  • Either use `substr(your_string, 2, nchar(your_string) - 1)` or captured groups. – Jan Apr 03 '18 at 13:11
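
Both suggestions from the last comment can be sketched as below, under the assumption that the desired output is `\calculus{fff}##anystring$$` as described in the follow-up question. With `substr()` the start index is 2 so that the opening brace is dropped along with the closing one. The variable names and the capture-group pattern are illustrative, not taken from the answer.

```r
x <- "sdfwervd \\calculus{fff}{\\trt{sdfsdf} & \\trt{sdfsdf} & \\trt{sdfsdf} \\{} sdfsdf & sdfsdf & sefgse3 } aserdd wersdf sewtgdf"

# Option 1: take the second balanced group found by the answer's pattern,
# then strip its first and last character with substr().
braced <- regmatches(x, gregexpr("\\{(?:[^{}]*|(?R))*\\}", x, perl = TRUE))[[1]][2]
inner <- substr(braced, 2, nchar(braced) - 1)

# Option 2: capture groups. Group 1 is \calculus{...}; group 2 is the balanced
# content between the outer braces, with (?2) recursing for nested braces.
pattern <- "(\\\\calculus\\{[^{}]*\\})\\{((?:[^{}]+|\\{(?2)\\})*)\\}"
rewritten <- sub(pattern, "\\1##\\2$$", x, perl = TRUE)

cat(rewritten, "\n")
```

In R's replacement strings only backslash sequences such as `\\1` are special, so `$$` is inserted literally.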