I want to scrape email addresses from this website, but they are protected. They are visible on the page in a browser, but when I scrape it I get an encoded placeholder instead of the real address.
I have tried scraping it but got this result:
<a href="/cdn-cgi/l/email-protection#d5a7bba695b9a6b0b2fbb6bab8"><span class="__cf_email__" data-cfemail="c0b2aeb380acb3a5a7eea3afad">[email protected]</span></a>
My code:
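from bs4 import BeautifulSoup as bs
import requests
import re

# Headers sent with each article request (assumed value: any browser-like User-Agent)
header = {'User-Agent': 'Mozilla/5.0'}

# The newsroom endpoint returns JavaScript; extract the HTML embedded in it
r = requests.get('https://www.accesswire.com/api/newsroom.ashx')
p = re.compile(r" \$\('#newslist'\)\.after\('(.*)\);")
html = p.findall(r.text)[0]
soup = bs(html, 'lxml')
headlines = [item['href'] for item in soup.select('a.headlinelink')]

# Fetch each article page and print all of its links
for head in headlines:
    response2 = requests.get(head, headers=header)
    soup2 = bs(response2.content, 'html.parser')
    print([a for a in soup2.select("a")])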
I want the emails that appear in the article body, e.g. "Email: theramedhealthcorp@gmail.com" on this page: https://www.accesswire.com/546295/Theramed-Provides-Update-on-New-Sales-Channel-for-Nevada-Facility. The email is protected, though. How can I scrape it as plain text, i.e. as the real email address? Thanks
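
For reference, here is a minimal sketch of the decoding step. It assumes the data-cfemail value follows the usual Cloudflare email obfuscation scheme, in which the attribute is a hex string whose first byte is an XOR key for the remaining bytes. The function name decode_cfemail is hypothetical, and the input is the attribute value from the output shown earlier, not a value fetched live.

```python
def decode_cfemail(encoded: str) -> str:
    """Decode the hex string from a data-cfemail attribute.

    Assumption: the first byte is an XOR key and every following byte
    is one character of the address XORed with that key.
    """
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


# Value copied from the data-cfemail attribute in the scraped output
print(decode_cfemail("c0b2aeb380acb3a5a7eea3afad"))
```

In the scraping loop, this could be applied to the data-cfemail attribute of each element matched by soup2.select("span.__cf_email__"), in place of the "[email protected]" placeholder text.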