
I have looked at many posts here on SO that suggest regex patterns for grabbing text from parentheses. However, none of the solutions I have found works for my case.

For example, I have looked at the following: "R - Regular Expression to Extract Text Between Parentheses That Contain Keyword", "Extract text in parentheses in R", and "regex to pickout some text between parenthesis [duplicate]".

Here are the solutions from the top answers, in the same order (with some amendments):

pattern1= '\\([^()]*[^()]*\\)'
pattern2= '(?<=\\()[^()]*(?=\\))'
pattern3= '.*\\((.*)\\).*'
all_patterns = c(pattern1, pattern2, pattern3)

I tested them as follows:

sapply(all_patterns , function(x)stringr::str_extract('I(data^2)', x))

   \\([^()]*[^()]*\\) (?<=\\()[^()]*(?=\\))        .*\\((.*)\\).* 
           "(data^2)"              "data^2"           "I(data^2)" 

None of these seems to grab only the characters within the brackets, so how can I grab just the characters inside the brackets?

Expected output:

data
  • Try with a regex lookaround `str_extract('I(data^2)', '(?<=\\()[^\\^\\)]+')# [1] "data"` – akrun Jun 06 '22 at 15:49
  • For your example string `'I(data^2)'` you say that the desired result is `'data'`, but your title indicates it should be `'data^2'`. You also say 'grab text'. Does that mean match letters only? You need to clarify (even though you have selected an answer, as your question will be read by many in future). What would be the desired result for the string `'I(3da ta^2)'`? – Cary Swoveland Jun 06 '22 at 17:20

2 Answers


With `str_extract`, everything the pattern matches is returned, so the brackets (or the `^2`) come along whenever the pattern includes them. Instead, use a regex lookaround: match one or more characters that are not a `^` or a closing bracket `)` (`[^\\^\\)]+`) and that follow an opening bracket (`(?<=\\()`). These characters are escaped (`\\`) because they are metacharacters.

library(stringr)
str_extract('I(data^2)', '(?<=\\()[^\\^\\)]+')
# [1] "data"
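
For anyone without stringr, roughly the same lookbehind can be sketched in base R with `regexpr()` and `regmatches()` (using `perl = TRUE` for lookaround support). The helper name `inside_parens` is hypothetical and only for illustration; it is not part of any package.

```r
# Hypothetical helper (illustration only): returns the text that follows the
# first "(" up to, but not including, the first "^" or ")".
# Elements without a match become NA, mirroring str_extract().
inside_parens <- function(x) {
  m <- regexpr("(?<=\\()[^\\^)]+", x, perl = TRUE)
  hit <- as.vector(m) != -1L          # regexpr() gives -1 where nothing matched
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)        # regmatches() returns matched elements only
  out
}

inside_parens(c("I(data^2)", "log(x)", "no brackets"))
# [1] "data" "x"    NA
```

The third element shows what happens when a string has no opening bracket: nothing follows a `(`, so the result is `NA`.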
akrun
  • Why would you escape the metacharacters in a character class? i.e., won't `'(?<=\\()([^^)]+)'` work? – Onyambu Jun 06 '22 at 16:09
  • @onyambu it is just to differentiate, or else the `^^` would seem a bit difficult to read – akrun Jun 06 '22 at 16:10

Here is a combination of `str_extract` and `str_remove`:

library(stringr)

str_extract(str_remove('I(data^2)', '.\\('), '\\w*')

[1] "data"
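
Note that `'.\\('` removes only one character plus the opening bracket, so this works when exactly one character precedes the `(`; for `'log(data^2)'` it would return `"lodata"`. A possible variant that tolerates longer prefixes is sketched below in base R, with `sub()` swapped in for `str_remove`/`str_extract`. The variable names `tail_part` and `result` are just for illustration.

```r
x <- c("I(data^2)", "log(data^2)")

# Step 1: drop everything up to and including the first "(".
tail_part <- sub("^[^(]*\\(", "", x, perl = TRUE)

# Step 2: keep only the leading run of word characters.
result <- sub("^(\\w*).*$", "\\1", tail_part, perl = TRUE)

result
# [1] "data" "data"
```

This sketch assumes every input contains an opening bracket; a string without one passes through step 1 unchanged, so it would need a separate check.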
TarJae