I'm trying to extract phone numbers from a large set of email files. I wrote a regex to extract them, but it only returns results for one format.
import re
# Raw string so the backslash escapes reach the regex engine unchanged
PHONERX = re.compile(r"(\d{3}[-\.\s]??\d{3}[-\.\s]??\d{4}|\(\d{3}\)\s*\d{3}[-\.\s]??\d{4}|\d{3}[-\.\s]??\d{4})")
phonenumber = PHONERX.findall(content)
When I reviewed the data, I found that phone numbers appear in many formats.
How can I extract phone numbers in all of these formats at once:
800-569-0123
1-866-523-4176
(324)442-9843
(212) 332-1200
713/853-5620
713 853-0357
713 837 1749
This link is a sample of the dataset. The problem is that the regex sometimes also matches digits from the message ID and other numbers in the email: https://www.dropbox.com/sh/pw2yfesim4ejncf/AADwdWpJJTuxaJTPfha38OdRa?dl=0
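
To make the goal concrete, here is a rough sketch of the kind of pattern I have in mind. It is an assumption on my part, not something tested against the whole dataset. The name PHONE_RX and the sample header line are made up for illustration. The idea is to require a separator between the digit groups and to refuse matches that touch other digits, so that long digit runs such as message IDs are skipped.

```python
import re

# Hypothetical pattern (an assumption, not verified on the full dataset).
# Separators are mandatory, so unbroken digit runs are not matched.
PHONE_RX = re.compile(r"""
    (?<!\d)                          # do not start in the middle of a number
    (?:1[-.\s])?                     # optional leading 1 and a separator
    (?:\(\d{3}\)\s?|\d{3}[-./\s])    # area code, with or without parentheses
    \d{3}[-.\s]\d{4}                 # local part: 3 digits, separator, 4 digits
    (?!\d)                           # do not stop in the middle of a number
""", re.VERBOSE)

# The seven formats listed in the question
samples = [
    "800-569-0123",
    "1-866-523-4176",
    "(324)442-9843",
    "(212) 332-1200",
    "713/853-5620",
    "713 853-0357",
    "713 837 1749",
]

# Each sample should be matched in full
for s in samples:
    print(s, "->", PHONE_RX.findall(s))

# Made-up header line: the long digit runs have no valid separators
header = "Message-ID: <24216240.1075855687451.JavaMail.evans@thyme>"
print(PHONE_RX.findall(header))
```

This sketch drops the 7-digit form without an area code that my original pattern allowed, and because `\s` also matches a newline it could still join digits across a line break, so I am not sure it is the right trade-off.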