I am looking for a regex that will let me extract the names and drop everything inside the parentheses. Example data is below.

Text string:

John (Juan, Jonathan, Jon, Jonny) James Doe (born on January 1, 1900)

Desired output:

John James Doe

Furthermore, in some cases the text string may look like this:

John (Juan, Jonathan, Jon, Jonny) James Doe (born on January 1, 1900) (Canada)

and in this case we would still want the following returned:

John James Doe

I tried the solution from the linked question, but I still get the wrong output:

John James Doe (born on January 1, 1900)
pptaszni
Stpete111
  • Running the solution from the thread you linked works for multiple parentheses when I test. Maybe you need to clarify why that doesn't work? – jarcobi889 Aug 18 '20 at 21:58
  • It is not returning what he asked for. It only selects the words between the parentheses. The answer he linked is not suitable. That's why he posted this. – vincent PHILIPPE Aug 18 '20 at 22:00
  • @jarcobi889 when I run the solution from the linked thread, the string I get returned is `John James Doe (born on January 1, 1900)` – Stpete111 Aug 18 '20 at 22:02
  • Are you running this? `text = "John (Juan, Jonathan, Jon, Jonny) James Doe (born on January 1, 1900) (Canada)"` `re.sub(r'\([^)]*\)', '', text)` `'John James Doe '` ? That's the output I get using the linked question. – jarcobi889 Aug 18 '20 at 22:08
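
For reference, the `re.sub` call from the last comment can be sketched as a small helper. This is only an illustration, assuming Python's standard `re` module; the function name `strip_parentheses` is hypothetical and not from the thread. Removing the groups leaves the surrounding spaces behind, so the sketch also collapses runs of whitespace and trims the ends.

```python
import re


def strip_parentheses(text: str) -> str:
    """Remove every (...) group, then normalize the leftover whitespace."""
    # Delete each parenthesized group: "(" followed by non-")" characters and ")".
    without_groups = re.sub(r"\([^)]*\)", "", text)
    # Removing the groups leaves extra spaces; split/join collapses them.
    return " ".join(without_groups.split())


text = "John (Juan, Jonathan, Jon, Jonny) James Doe (born on January 1, 1900) (Canada)"
print(strip_parentheses(text))  # John James Doe
```

Note that this pattern does not handle nested parentheses; the sample strings in the question do not contain any.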

1 Answer

Using only a regex, without any replace function:

[^\S]*(\w*)(?:\s*)(?:\([^()]*\))*

https://regex101.com/r/zF2cMM/4

Edit:

(?:[^\S]*(\w*)(?:\s*)(?:\s*\([^()]*\)\S*)*)

I made this last version to fix a problem with the last match.

You can compare V4 and V6 to see that the results differ slightly.

https://regex101.com/r/zF2cMM/6

Now this works fine.
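
Here is a rough sketch of how the V6 pattern could be used from Python, assuming the standard `re` module. The helper name `extract_names` is hypothetical. Each match captures one word in group 1, and the pattern can also produce an empty match at the end of the string, so empty captures are filtered out before joining.

```python
import re

# The V6 pattern from above: one word in group 1, followed by any number
# of parenthesized groups that are matched but not captured.
NAME_PATTERN = re.compile(r"(?:[^\S]*(\w*)(?:\s*)(?:\s*\([^()]*\)\S*)*)")


def extract_names(text: str) -> str:
    """Join the words captured by group 1, skipping empty matches."""
    # With a single capturing group, findall returns a list of group-1 strings.
    words = [word for word in NAME_PATTERN.findall(text) if word]
    return " ".join(words)


text = "John (Juan, Jonathan, Jon, Jonny) James Doe (born on January 1, 1900) (Canada)"
print(extract_names(text))  # John James Doe
```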

vincent PHILIPPE