Get every email of random pages

Question

I want to collect every email address from 1000 webpages using Selenium with Python.

My idea:

a = driver.page_source

However, I can't work out how to extract the email addresses from a.

Check my answer, it should work. If it helped, you can accept it to mark it as a working answer. - FLAK-ZOSO, Feb 06 '22 at 10:18

score 1 · Accepted Answer · answered Feb 06 '22 at 10:07

You can get a list of the links this way:

links = [elem.get_attribute('href') for elem in elems]

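where elems is the list returned by one of the driver.find_elements_by_...() methods, for example:

elems = driver.find_elements_by_css_selector('a') # Select <a> tags, since those are the elements that carry an href attribute

In Selenium 4 the find_elements_by_...() methods are deprecated in favour of driver.find_elements(By.CSS_SELECTOR, 'a').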

You can check if it's an email this way:

def isMail(link: str):
    # get_attribute('href') returns None for <a> tags without an href
    return link is not None and link.startswith('mailto:')

So

mails = [link.removeprefix('mailto:') for link in links if isMail(link)] # str.removeprefix requires Python 3.9+
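
Putting the steps above together, here is a minimal sketch of how this might look across many pages. The names emails_from_hrefs and collect_emails are hypothetical helpers, not part of Selenium, and the sketch assumes Selenium 4's find_elements(By.CSS_SELECTOR, ...) locator style and an already created driver. It also strips a trailing ?subject=... part and splits comma-separated recipients, which the one-liner above does not do.

```python
from urllib.parse import unquote


def emails_from_hrefs(hrefs):
    """Return the unique addresses found in mailto: hrefs, in first-seen order."""
    found = []
    for href in hrefs:
        # get_attribute('href') returns None for <a> tags without an href
        if not href or not href.lower().startswith('mailto:'):
            continue
        # Drop the scheme and any '?subject=...' query part
        recipients = href[len('mailto:'):].split('?', 1)[0]
        # One link may list several comma-separated recipients
        for address in unquote(recipients).split(','):
            address = address.strip()
            if address and address not in found:
                found.append(address)
    return found


def collect_emails(driver, urls):
    """Visit each URL with an existing Selenium driver and gather mailto addresses."""
    # Imported here so emails_from_hrefs can be used without Selenium installed
    from selenium.webdriver.common.by import By

    hrefs = []
    for url in urls:
        driver.get(url)
        # Read the hrefs immediately, before navigating away makes the elements stale
        hrefs.extend(a.get_attribute('href')
                     for a in driver.find_elements(By.CSS_SELECTOR, 'a'))
    return emails_from_hrefs(hrefs)
```

With a running driver, collect_emails(driver, list_of_urls) would return one de-duplicated list for all pages; for 1000 pages you would probably also want error handling around driver.get.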

I would also suggest reading this and this.
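
The driver.page_source idea from the question can also work for addresses that appear as plain text rather than as mailto: links. A rough sketch, assuming a deliberately simple regular expression is good enough; emails_from_source and EMAIL_RE are hypothetical names, and the pattern will miss some valid addresses and accept some invalid ones:

```python
import re

# Simple email-like pattern: local part, '@', domain, and a top-level domain of 2+ letters
EMAIL_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')


def emails_from_source(page_source):
    """Return the unique email-like strings in an HTML string, in first-seen order."""
    # dict.fromkeys removes duplicates while keeping the original order
    return list(dict.fromkeys(EMAIL_RE.findall(page_source)))
```

It would be called as emails_from_source(driver.page_source) after each driver.get(url).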
