I need to remove a single closing parenthesis from a string to fix an edge case in a simple regex problem.

I need to remove text within parentheses, but the solution I am currently using doesn't handle an extra, unmatched closing parenthesis well. Should I use a different approach, or can I add an extra step to handle this case?

Below is an example where every call should return "brother". I have marked the line that fails with a comment.

cleaner = function(x){
  x = tolower(x)
  ## if terms are in brackets - assume this is an alternative and remove
  x = stringr::str_remove_all(x, "\\(.*\\)")
  ## if terms are separated by semicolons or commas, take the first, assume the others are alternatives and remove them
  x = gsub("^(.*?)(,|;).*", "\\1", x)
  ## remove whitespace
  x = stringi::stri_replace_all_charclass(x, "\\p{WHITE_SPACE}", "")
  x
}

cleaner("brother(bro)")
cleaner("brother;bro")
cleaner("bro   ther")
cleaner("(bro)brother   ;bro")
cleaner("(bro)brother   ;bro)") ## this fails
cleaner("(bro)brother   ;(bro") ## this doesn't

stringr::str_remove_all("(bro)brother   ;bro)", "\\(.*\\)")

Thanks,

Sam

SamPassmore
  • `\(.*\)` matches `(` and then all text up to the last `)`. Use lazy `.*?` instead of `.*`. Or just `"\\([^()]*\\)"`, see [my answer with R solution](https://stackoverflow.com/a/40621332/3832970). – Wiktor Stribiżew Oct 02 '19 at 10:31
  • Yes, this works. I had not come across this terminology before. – SamPassmore Oct 02 '19 at 10:37
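
For anyone landing here, a minimal sketch of the fix suggested in the comment above, under a few assumptions: `cleaner_fixed` and `inputs` are illustrative names that are not in the question, base R `gsub` stands in for the stringr/stringi calls so the snippet runs without extra packages, and the step that strips leftover unmatched parentheses is an optional addition that assumes a stray `(` or `)` carries no meaning.

```r
# Sketch only: replaces the greedy "\\(.*\\)" with a pattern that cannot
# run past a closing parenthesis, so a stray ")" no longer widens the match.
cleaner_fixed <- function(x) {
  x <- tolower(x)
  # remove "(...)" groups that contain no other parentheses
  x <- gsub("\\([^()]*\\)", "", x)
  # keep only the text before the first comma or semicolon
  x <- gsub("^(.*?)(,|;).*", "\\1", x, perl = TRUE)
  # assumption: any parenthesis still left is unmatched and can be dropped
  x <- gsub("[()]", "", x)
  # remove all whitespace
  x <- gsub("[[:space:]]+", "", x)
  x
}

# The six inputs from the question, including the one that failed
inputs <- c(
  "brother(bro)",
  "brother;bro",
  "bro   ther",
  "(bro)brother   ;bro",
  "(bro)brother   ;bro)",
  "(bro)brother   ;(bro"
)
results <- cleaner_fixed(inputs)
print(results)
```

With the greedy `.*`, the fifth input is matched from the first `(` to the final `)`, so the whole string is removed. The negated class `[^()]*` stops at the first `)`, so only `(bro)` is removed and the stray `)` is then discarded along with everything after the semicolon. All six inputs should come back as "brother".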
