I have an email template whose content is in HTML format.

Now I want to extract the zip code from the email's HTML content.

For that I used a regex to search for the zip code. The content looks like this. Format 1:

helllo this is the mail  which will converted in the lead 
and here is some addresss  which will not be used..

and the 
zip: 364001
city: New york

Format 2:

<p><b>Name</b></p><br/>
fname
<p><b>Last Name</b></p><br/>
lname
<p><b>PLZ</b></p><br/>
71392
<p><b>mail</b></p><br/>
heliconia72@mail.com

The code looks like this:

regex = r'(?P<zip>Zip:\s*\d\d\d\d\d\d)'
zip_match = re.search(regex, mail_content) # find zip
zip_match.groups()[0]

This pattern only targets format 1 (the "zip:" line). How can I write a regex that works for both formats?

OpenCurious

1 Answer

If you really need to use regex for this (I would probably use BeautifulSoup for the second), you could use this for example:

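import re

# Match the digits after "zip:" (format 1) or after the PLZ label markup (format 2).
# {5,6} covers both the five-digit and the six-digit code from the samples.
regex = r'(?:zip:\s*|PLZ</b></p><br/>\s*)(\d{5,6})'
zip_match = re.search(regex, mail_content)
zip_match.group(1)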
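
Since `re.search` returns `None` when nothing matches, it may help to wrap the lookup in a small helper. The following is only a sketch under assumptions: `find_zip`, `ZIP_PATTERN`, and the sample strings are hypothetical names made up for illustration, and it assumes the code is five or six digits long, as in the two sample formats.

```python
import re

# Hypothetical pattern: the label is either "zip:" (format 1) or the
# "PLZ" heading markup (format 2), followed by 5 or 6 digits.
ZIP_PATTERN = re.compile(
    r'(?:zip:\s*|PLZ</b></p><br/>\s*)(\d{5,6})\b',
    re.IGNORECASE,  # assumption: "Zip:" and "zip:" should both be accepted
)

def find_zip(mail_content):
    """Return the postal code as a string, or None if no label is found."""
    match = ZIP_PATTERN.search(mail_content)
    return match.group(1) if match else None

# Sample inputs modeled on the two formats in the question.
format_1 = "and the\nzip: 364001\ncity: New york\n"
format_2 = "<p><b>PLZ</b></p><br/>\n71392\n<p><b>mail</b></p><br/>\n"

print(find_zip(format_1))
print(find_zip(format_2))
print(find_zip("no postal code here"))
```

Returning `None` instead of calling `.group()` directly avoids an `AttributeError` on mails that contain neither label.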
Dave Halter