mystring <- c("code IS (k(384333)\n AND parse = TURE \n ) \n
code IS (\n FROM (43343344)\n ) some information code IS \n
code IS ( ( \n (data)(23423422 \n)) ) ) and more information)")
I would like to extract all instances of code IS (...). But because of the nested parentheses, my regex stops at the first closing parenthesis.
library(stringr)
> str_extract_all(pattern = 'code IS \\([\\s\\S]+?\\)', mystring)
[[1]]
[1] "code IS (k(384333)" "code IS (\n FROM (43343344)" "code IS ( ( \n (data)"
The desired output is
[[1]]
[1] "code IS (k(384333)\n AND parse = TURE \n )" "code IS (\n FROM (43343344)\n )" "code IS ( ( \n (data)(23423422 \n)) )"
Edit: Potential regex solutions are here:
The question now is: how do I adapt these solutions to work with str_extract_all in R?
My attempt at using a PCRE pattern:
> str_extract_all(pattern = 'code IS \((?:[^)(]+|(?R))*+\)', mystring)
Error: '\(' is an unrecognized escape in character string starting "'code IS \("
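For reference, here is a minimal sketch of one direction that might work, under a few assumptions. The error above comes from R's string parser rather than the regex engine: in an R string literal each backslash has to be doubled, so `\(` is written `\\(`. Separately, as far as I know, the ICU engine behind stringr does not support recursive patterns such as `(?R)`, so this sketch falls back to base R's `gregexpr()` and `regmatches()` with `perl = TRUE` instead of `str_extract_all`. It also swaps `(?R)` for `(?1)`, so that only the parenthesized group recurses and the `code IS ` prefix is not repeated at each nesting level. The names `input`, `balanced`, and `hits` are illustrative only.

```r
# Same test string as in the question, rebuilt so the snippet runs on its own.
input <- paste0(
  "code IS (k(384333)\n AND parse = TURE \n ) \n\n",
  "code IS (\n FROM (43343344)\n ) some information code IS \n\n",
  "code IS ( ( \n (data)(23423422 \n)) ) ) and more information)"
)

# Group 1 is the balanced-parentheses part; (?1) recurses into that group only.
# [^()]++ consumes runs of non-parenthesis characters (newlines included).
# Backslashes are doubled because this is an R string literal.
balanced <- "code IS (\\((?:[^()]++|(?1))*+\\))"

# perl = TRUE selects the PCRE engine, which supports recursion.
hits <- regmatches(input, gregexpr(balanced, input, perl = TRUE))
print(hits)
```

On the sample string this should return a list of length one holding the three matches from the desired output. The `code IS ` that is followed by a newline rather than an opening parenthesis is skipped.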