I have a vector in which each observation is a string holding a comma-separated list, for example:

"Alcohol Dependence (F10.20),Hep B (Z22.51),Hep C (Z22.52),Opioid Abuse (F11.19),Pain- Back, low (M54.5),Pain- Back, upper (M54.9),Respiratory- Tuberculosis (TB) (A15.9)"

I am trying to remove the parentheses and everything between them, but I can't figure out which regular expression to use. I am using the str_replace_all function from the stringr package. Any help would be greatly appreciated!

2 Answers

This does exactly what is requested: it removes each pair of parentheses and everything in between.

str_replace_all(text_line, "\\([^\\)]*\\)", "")

You might also wish to remove the space(s) before each opening parenthesis:

str_replace_all(text_line, " *\\([^\\)]*\\)", "")
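
To see the difference between the two patterns, here is a small sketch that applies both to the sample string from the question. The variable names keep_space and no_space are made up for illustration; it assumes stringr is installed.

```r
library(stringr)

# Sample input copied from the question
text_line <- "Alcohol Dependence (F10.20),Hep B (Z22.51),Hep C (Z22.52),Opioid Abuse (F11.19),Pain- Back, low (M54.5),Pain- Back, upper (M54.9),Respiratory- Tuberculosis (TB) (A15.9)"

# Pattern 1: remove each "(...)" group, leaving the space that preceded it
keep_space <- str_replace_all(text_line, "\\([^\\)]*\\)", "")

# Pattern 2: also remove any spaces directly before each "(...)" group
no_space <- str_replace_all(text_line, " *\\([^\\)]*\\)", "")

print(keep_space)
print(no_space)
# no_space is:
# "Alcohol Dependence,Hep B,Hep C,Opioid Abuse,Pain- Back, low,Pain- Back, upper,Respiratory- Tuberculosis"
```

With the first pattern, a stray space is left before each comma and at the end of the string, which is why the second pattern is probably the one you want here. Note that both patterns also remove "(TB)", since it is a parenthesised group like the codes.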
Melissa Key

You can use this regular expression:

\s*\(.*?\)\s*

And replace with an empty string "".

str_replace_all(your_string, "\\s*\\(.*?\\)\\s*", "")

Explanation:

  • The middle part \(.*?\) uses a lazy quantifier (*?), so it stops at the first ) it reaches instead of running on to the last one. The \s* at the start and end matches any surrounding whitespace, so you don't get extra spaces in the result.
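
A quick way to check the lazy-versus-greedy point is to run both variants on a shortened piece of the question's string. This is only a sketch: the names x, lazy, greedy and merged are invented for the example, and it assumes stringr is installed.

```r
library(stringr)

# Shortened sample with two parenthesised codes
x <- "Hep B (Z22.51),Hep C (Z22.52)"

# Lazy (.*?): each "(...)" group is matched and removed on its own
lazy <- str_replace_all(x, "\\s*\\(.*?\\)\\s*", "")
print(lazy)    # "Hep B,Hep C"

# Greedy (.*): the match runs from the first "(" to the last ")",
# so the text between the two groups is removed as well
greedy <- str_replace_all(x, "\\s*\\(.*\\)\\s*", "")
print(greedy)  # "Hep B"

# Caveat: the trailing \\s* also eats whitespace after the ")",
# so words on either side of a group get joined together
merged <- str_replace_all("low (M54.5) back", "\\s*\\(.*?\\)\\s*", "")
print(merged)  # "lowback"
```

In the question's data every closing parenthesis is followed by a comma, another group, or the end of the string, so the caveat in the last call does not show up there; it only matters if text can follow a group after a space.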
Sweeper