I'm trying to extract every first and last name from a string in the Outlook "To" field format and save each one in a Python list. I'm using Python 3.6.4.
For example, I would like the following string:
"To: John Lennon <John.Lennon@gmail.com> \b002; Paul McCartney <Paul.McCartney@yahoo.com> \b002;"
to be parsed into:
['John Lennon','Paul McCartney']
I used the question "Replace all words from word list with another string in python" as a reference and came up with this code:
import re
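# '\b002' is deliberately not a raw string, so it matches the same backspace character as in mystring
prohibitedWords = [r'to:', r'To:', '\b002', r'<(.*?)>']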
mystring = 'To: John Lennon <John.Lennon@gmail.com> \b002; Paul McCartney <Paul.McCartney@yahoo.com> \b002;'
big_regex = re.compile('|'.join(prohibitedWords))
the_message = big_regex.sub("", mystring).strip()
print(the_message)
However, I'm getting the following results:
John Lennon ; Paul McCartney ;
This is not what I want: the result is still a single string with leftover spaces and semicolons that I don't know how to parse into a list. I also have a feeling that stripping out unwanted substrings is not the best approach to this problem.
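One alternative I have been considering is to capture the names directly instead of deleting everything else. Below is a minimal sketch, assuming every display name is immediately followed by an address in angle brackets and that names never contain ':', ';', '<' or '>'. The names extract_names and name_before_address are my own placeholders, and I have only tried it on the sample string:

```python
import re

# Same sample as above; '\b' in a normal string literal is a backspace character.
mystring = 'To: John Lennon <John.Lennon@gmail.com> \b002; Paul McCartney <Paul.McCartney@yahoo.com> \b002;'

# Hypothetical pattern: capture the text right before '<address>'.
# Assumes a display name never contains ':', ';', '<' or '>'.
name_before_address = re.compile(r"([^<>;:]+?)\s*<[^<>]*>")


def extract_names(header):
    """Return the display names found in an Outlook-style 'To' string."""
    return [name.strip() for name in name_before_address.findall(header)]


print(extract_names(mystring))  # ['John Lennon', 'Paul McCartney']
```

I'm not sure how robust this is for other inputs, for example quoted display names or entries that have an address but no name.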
I would appreciate any advice.
Thanks