To extract everything before the first occurrence of a whitespace char followed by an opening round bracket, (, you may use re.search (this method returns only the first match):
re.search(r'^(.*?)\s\(', text, re.S).group(1)
re.search(r'^\S*(?:\s(?!\()\S*)*', text).group()
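See the regex #1 and regex #2 demos. Note that the second one, though longer, is much more efficient, since it follows the unroll-the-loop principle: it consumes whole runs of non-whitespace chars at once instead of expanding a lazy .*? one char at a time and re-testing \s\( at each position.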
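The two patterns agree whenever a whitespace char followed by ( is present, but they differ when it is absent. The sketch below illustrates this; the helper names before_bracket_lazy and before_bracket_unrolled and the 'Athens' input are made up for illustration and are not part of the original answer.

```python
import re

# Both patterns are taken from the answer above; the names are hypothetical.
LAZY = re.compile(r'^(.*?)\s\(', re.S)
UNROLLED = re.compile(r'^\S*(?:\s(?!\()\S*)*')

def before_bracket_lazy(text):
    # search() returns None when there is no whitespace + "(" in the text,
    # so guard before calling .group(1) to avoid an AttributeError.
    m = LAZY.search(text)
    return m.group(1) if m else None

def before_bracket_unrolled(text):
    # Every part of this pattern is optional, so it always matches;
    # without a whitespace + "(" it simply consumes the whole string.
    return UNROLLED.search(text).group()

sample = 'Isla Vista (University of California, Santa Barbara)[2]'
print(before_bracket_lazy(sample))        # → Isla Vista
print(before_bracket_unrolled(sample))    # → Isla Vista
print(before_bracket_lazy('Athens'))      # → None
print(before_bracket_unrolled('Athens'))  # → Athens
```

If "no bracket" has to be told apart from "bracket found", the first pattern with the None guard is the simpler choice; if returning the whole string is acceptable in that case, the second pattern can be used unguarded.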
Details
^ - start of string
(.*?) - Group 1: any 0+ chars, as few as possible
\s\( - a whitespace char and a ( char.
Or, better:
^\S* - start of string and then 0+ non-whitespace chars
(?:\s(?!\()\S*)* - 0 or more occurrences of:
    \s(?!\() - a whitespace char not followed by (
    \S* - 0+ non-whitespace chars
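The same breakdown can be kept next to the pattern itself with the re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments. This is only a sketch; the name UNROLLED_VERBOSE is hypothetical.

```python
import re

# Verbose spelling of the second pattern; behaves the same as the compact one.
UNROLLED_VERBOSE = re.compile(r'''
    ^\S*            # start of string, then 0+ non-whitespace chars
    (?:             # non-capturing group, repeated 0+ times:
        \s(?!\()    #   a whitespace char not followed by an opening round bracket
        \S*         #   then 0+ non-whitespace chars
    )*
''', re.VERBOSE)

text = 'Carrollton (University of West Georgia)[2]'
print(UNROLLED_VERBOSE.search(text).group())  # → Carrollton
```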
See the Python demo:
import re

strs = [
    'Isla Vista (University of California, Santa Barbara)[2]',
    'Carrollton (University of West Georgia)[2]',
    'Dahlonega (North Georgia College & State University)[2]',
]
# Compile the unrolled pattern once and reuse it for every string
rx = re.compile(r'^\S*(?:\s(?!\()\S*)*', re.S)
for s in strs:
    m = rx.search(s)
    if m:
        # Print the input next to the text that precedes its first " ("
        print('{} => {}'.format(s, m.group()))
    else:
        print("{}: No match!".format(s))