In Python, I am trying to write an algorithm that takes the characters from an email address and searches the page to estimate how likely a given string is to be the person's actual name. I wrote a regular expression that grabs all the emails on a page, and now I want to write another one that finds the person's name from the email (since the email is usually a subset of the name, or is made up of some of its characters).
I am using:
self.reEmail = re.compile(r"\b(?!(?:.\B)*(.)(?:\B.)*\1)[char]+\b", re.IGNORECASE)
However, this only gives me single characters.
email: bjoel@email.edu
name: Billy Joel (this is what I want to scrape)
However, the first letter of the email is not always the first letter of the first name.
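To make the goal concrete, here is a minimal sketch of the kind of matching described above. It is not a definitive solution and rests on two assumptions that are not stated in the question: a candidate name is two adjacent capitalized words, and the letters of the email's local part appear in the same order inside the name. All function names (local_part, is_subsequence, candidate_names, rank_names) are hypothetical.

```python
import re

def local_part(email):
    # Everything before the "@", lowercased: "bjoel@email.edu" -> "bjoel"
    return email.split("@", 1)[0].lower()

def is_subsequence(short, long):
    # True if the characters of `short` appear in `long` in the same order.
    it = iter(long)
    return all(ch in it for ch in short)

def candidate_names(text):
    # Assumption: a name is two adjacent capitalized words.
    # The lookahead lets candidates overlap ("Contact Billy", "Billy Joel").
    return re.findall(r"(?=\b([A-Z][a-z]+ [A-Z][a-z]+)\b)", text)

def rank_names(email, text):
    # Keep only the letters of the local part (drop digits, dots, etc.).
    key = re.sub(r"[^a-z]", "", local_part(email))
    scored = []
    for name in candidate_names(text):
        letters = name.replace(" ", "").lower()
        if key and is_subsequence(key, letters):
            # Crude score: share of the name's letters covered by the email.
            scored.append((len(key) / len(letters), name))
    return [name for score, name in sorted(scored, reverse=True)]

page = "Contact Billy Joel at bjoel@email.edu or ask Elton John."
print(rank_names("bjoel@email.edu", page))
```

Because the check only requires the letters to appear in order, it does not depend on the email starting with the first initial. It would still miss reordered forms such as a last-name-first email, which would need an extra check against the name's words in swapped order.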