1

I have strings in the following format: name1 <email1@email.com>. How can I use a regex to pull out only the name1 part? Also, how would I do this for a string containing several such names and emails, say name1 <email1@email.com>, name2 <email2@email.com>?

supersaiyajin87

3 Answers

3

Try using split:

In [164]: s = 'name1 <email1@email.com>, name2 <email2@email.com>'
In [166]: [i.split()[0] for i in s.split(',')]
Out[166]: ['name1', 'name2']

If you have just one name:

In [161]: s = 'name1 <email1@email.com>'
In [163]: s.split()[0]
Out[163]: 'name1'
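The split approach above takes the first whitespace-separated token, so a multi-word name would be cut short. A minimal sketch of a variant that keeps everything before the '<' is shown here; the helper name names_from is hypothetical, and it assumes names contain no commas or '<' characters.

```python
def names_from(entries: str) -> list[str]:
    """Return the text before '<' in each comma-separated entry."""
    # Assumption: names themselves contain no commas or '<' characters.
    return [part.partition('<')[0].strip() for part in entries.split(',')]


names = names_from('John Smith <js@example.com>, name2 <email2@email.com>')
print(names)
```

The same function handles a single entry, since splitting on ',' then yields a one-element list.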
Mayank Porwal
2

You can start with (\w+)\s<.*?>(?:,\s)? (see on regex101.com), which relies on the fact that emails are surrounded by < >, and customize it as you see fit.

Note that this regex does not specifically look for emails, just for text surrounded by < >.

Don't fall down the rabbit hole of trying to specifically match emails.

import re

regex = re.compile(r'(\w+)\s<.*?>(?:,\s)?')
string = 'name1 <email1@email.com>, name2 <email2@email.com>'

print(regex.findall(string))

outputs

['name1', 'name2']
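As one possible customization of the pattern above, the sketch below captures both the name and the text inside < >, and allows names made of several words. The pattern and the name pair_regex are illustrative assumptions, not part of the original answer, and like the original it does not validate the email itself.

```python
import re

# Hypothetical pattern: group 1 is the name (starting at a word character,
# with no ',' or '<' inside), group 2 is whatever sits between '<' and '>'.
pair_regex = re.compile(r'(\w[^<,]*?)\s*<([^>]*)>')

string = 'name1 <email1@email.com>, name2 <email2@email.com>'

# With two capture groups, findall returns a list of (name, address) tuples.
pairs = pair_regex.findall(string)
print(pairs)
```

Taking `[name for name, _ in pairs]` would then give just the names.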
DeepSpace
2
import re

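name = re.search(r'\w+(?=\s<)', 'name1 <email@email.com>')

print(name.group(0))  # name1

Explanation:

(?=...) is called a positive lookahead assertion. I put \s< into the ... because you are looking for the word that precedes the '<' of the email: \w+ matches only where whitespace and '<' follow, and the lookahead itself is not included in the match. (A negative lookbehind such as (?<! <) only checks what comes before the match, so it would also accept words inside the email address.)

re.search(r'\w+(?=...)', string_to_search)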

https://docs.python.org/3/library/re.html


Edit/Forgot:

To search a string containing multiple names:

import re

# (?=\s<) is a positive lookahead: match a word only when ' <' follows it.
regex = r"\w+(?=\s<)"

multi_name = "name1 <email@email.com>, name2 <email@email.com>"

for match in re.finditer(regex, multi_name):
    print(f"Match: {match.group()}")

# Match: name1
# Match: name2
nahar