I would like to get the email address from this site: https://irglobal.com/advisor/angus-forsyth

I tried it with the following code:

import time
import os
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

if __name__ == '__main__': 
  WAIT = 1  
  print("Checking Browser driver...")
  os.environ['WDM_LOG'] = '0'
  options = Options()
  options.add_argument("start-maximized")
  options.add_experimental_option("prefs", {"profile.default_content_setting_values.notifications": 1})
  # A second add_experimental_option call with the same key overwrites the first, so pass both switches at once
  options.add_experimental_option("excludeSwitches", ["enable-automation", "enable-logging"])
  options.add_experimental_option('useAutomationExtension', False)
  options.add_argument('--disable-blink-features=AutomationControlled')
  srv = Service(ChromeDriverManager().install())
  driver = webdriver.Chrome(service=srv, options=options)
  waitWD = WebDriverWait(driver, 10)
  
  link = "https://irglobal.com/advisor/angus-forsyth"
  print(f"Working for {link}")  
  driver.get (link)     
  time.sleep(WAIT) 
  soup = BeautifulSoup (driver.page_source, 'lxml')      
  tmp = soup.find("a", {"class": "btn email"})   
  print(tmp.prettify())
  driver.quit()

But I can't see any email address in this HTML tag:

(selenium) C:\DEV\Fiverr\TRY\saschanielsen>python tmp2.py
Checking Browser driver...
Working for https://irglobal.com/advisor/angus-forsyth
<a class="btn email" data-id="103548" href="#">
 <svg aria-hidden="true" class="svg-inline--fa fa-envelope" data-fa-i2svg="" data-icon="envelope" data-prefix="fas" focusable="false" role="img" viewbox="0 0 512 512" xmlns="http://www.w3.org/2000/svg">
  <path d="M48 64C21.5 64 0 85.5 0 112c0 15.1 7.1 29.3 19.2 38.4L236.8 313.6c11.4 8.5 27 8.5 38.4 0L492.8 150.4c12.1-9.1 19.2-23.3 19.2-38.4c0-26.5-21.5-48-48-48H48zM0 176V384c0 35.3 28.7 64 64 64H448c35.3 0 64-28.7 64-64V176L294.4 339.2c-22.8 17.1-54 17.1-76.8 0L0 176z" fill="currentColor">
  </path>
 </svg>
 <!-- <i class="fas fa-envelope"></i> -->
</a>
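
The printed markup hints at why no address shows up: the anchor's href is only "#", and the only person-specific value is data-id. Presumably (an assumption, not something the output above confirms) a script on the page resolves the address when the button is clicked. A minimal standard-library sketch that inspects the anchor's attributes; AnchorAttrCollector and SAMPLE are illustrative names, not part of the site or of Selenium:

```python
from html.parser import HTMLParser


class AnchorAttrCollector(HTMLParser):
    """Collect the attributes of every <a> tag that carries all wanted classes."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        if self.wanted <= classes:
            self.matches.append(attr_map)


# The anchor as printed in the output above, with the SVG body shortened.
SAMPLE = '<a class="btn email" data-id="103548" href="#"><svg></svg></a>'

collector = AnchorAttrCollector(["btn", "email"])
collector.feed(SAMPLE)
button = collector.matches[0]
print(button["href"])     # href is "#": no mailto: target in the markup
print(button["data-id"])  # the only person-specific value on the tag
```

With the real page, driver.page_source could be fed in place of SAMPLE.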

When I click the button manually on the site, I can see the email address in the email program that opens.

How can I get this email address?

This should not only work for the specific link: https://irglobal.com/advisor/angus-forsyth

It should also work for any person on this site, so I need the information that is behind this mail icon: https://irglobal.com/advisor/ns-shastri/ https://irglobal.com/advisor/adriana-posada/ etc.

Rapid1898

1 Answer

As an alternative to reading the email address from the opened email program, you can click the link that opens the respective URL in the adjacent tab and print the email address, inducing WebDriverWait for visibility_of_element_located() with the following locator strategy:

  • Code Block:

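    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get("https://irglobal.com/advisor/angus-forsyth/")
    parent_window = driver.current_window_handle
    driver.execute_script("scroll(0, 250);")
    # Click the first link after the <h1>; it opens the advisor's detail page in a new tab
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//h1//following::a[1]"))).click()
    # Wait until the new tab exists before reading the window handles
    WebDriverWait(driver, 20).until(EC.number_of_windows_to_be(2))
    new_window = [window for window in driver.window_handles if window != parent_window][0]
    driver.switch_to.window(new_window)
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[contains(., 'Email')]//a"))).text)
    driver.close()
    driver.switch_to.window(parent_window)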
    
  • Console Output:

    angus@angfor.hk
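
Because the XPath above depends on one particular page layout, a layout-independent fallback may be worth trying: scan whatever HTML the detail page returns for mailto: links and email-like text. The following is a minimal standard-library sketch, not a confirmed fix for every advisor page. EmailFinder, find_emails and the sample markup are hypothetical, and the regular expression is deliberately simple:

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit

# Deliberately simple pattern; it does not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailFinder(HTMLParser):
    """Collect addresses from mailto: links and from text nodes."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def _add(self, address):
        address = address.strip().lower()
        if address and address not in self.emails:
            self.emails.append(address)

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.lower().startswith("mailto:"):
            # mailto:user@host?subject=... -> keep only the address part
            path = unquote(urlsplit(href).path)
            for address in path.split(","):
                self._add(address)

    def handle_data(self, data):
        for address in EMAIL_RE.findall(data):
            self._add(address)


def find_emails(html):
    """Return the unique addresses found in an HTML string, in order."""
    finder = EmailFinder()
    finder.feed(html)
    finder.close()
    return finder.emails


# Hypothetical markup, assumed to resemble the "Email" span on the detail page.
sample = '<span>Email: <a href="mailto:angus@angfor.hk?subject=Hello">angus@angfor.hk</a></span>'
print(find_emails(sample))
```

With Selenium this could be called as find_emails(driver.page_source) after switching to the new tab. It only helps if the page actually contains the address in its HTML.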
    
undetected Selenium
  • Thanks for the solution, but this only works for this one specific link. For other pages like https://irglobal.com/advisor/adriana-posada/ this of course would not work, because the detail homepage is different. Is there any way to also get the email from the mail icon, as described in my question? – Rapid1898 Jun 27 '23 at 08:20
  • I also edited my question, so this should be clearer now – Rapid1898 Jun 27 '23 at 08:23