I tried all the methods I found here https://selenium-python.readthedocs.io/locating-elements.html but I couldn't get that link from the page.

I need to get the link from the href attribute.

HTML page code:

<p style="margin: 0px 30px 30px 30px;text-align: center;-webkit-box-sizing: border-box;max-width: 100%;-moz-box-sizing: border-box;box-sizing: border-box;">
                                <!--[if mso]>
                                    <v:roundrect xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="urn:schemas-microsoft-com:office:word" href="https://u.pcloud.com/track?url=aHR0cHM6Ly91LnBjbG91ZC5jb20vPyNwYWdlPXZlcmlmeW1haWwmY29kZT1UTE9TN1pUWXN6MmxPSWh2bWhZUU9jd2RxSHk3bUJoejk3&token=j7yZTLOS7Z7Z87Zh354dzp5jNRuIf7aJshX1XzSehQX" style="height:49px;v-text-anchor:middle;width:289px;" arcsize="7%" strokecolor="#88CC17" fillcolor="#88CC17">
                                        <w:anchorlock/>
                                        <center style="color:#ffffff;font-family:sans-serif;font-size:13px;font-weight:bold;">CLICK TO VERIFY EMAIL</center>
                                    </v:roundrect>
                                <![endif]-->
                                <a href="https://u.pcloud.com/track?url=aHR0cHM6Ly91LnBjbG91ZC5jb20vPyNwYWdlPXZlcmlmeW1haWwmY29kZT1UTE9TN1pUWXN6MmxPSWh2bWhZUU9jd2RxSHk3bUJoejk3&token=j7yZTLOS7Z7Z87Zh354dzp5jNRuIf7aJshX1XzSehQX" style="color: #FFF;background-color: #88CC17;text-decoration: none;width: 285px;font-weight: 500;display: inline-block;padding: 13px 0px 13px 0px;border: 2px solid #88CC17;border-radius: 3px;-moz-border-radius: 3px;-webkit-border-radius: 3px;mso-hide:all;">
                                    CLICK TO VERIFY EMAIL
                                </a>

Some of the unsuccessful attempts:

link_ativador = navegador.find_element(By.LINK_TEXT, 'CLICK TO VERIFY EMAIL')
print(link_ativador)

link_ativador = navegador.find_element(By.XPATH, '/html/body/table/tbody/tr[2]/td/table/tbody/tr[3]/td/table/tbody/tr/td/p[3]/a').get_attribute('href')
print(link_ativador)

link_ativador = navegador.find_element(By.LINK_TEXT, 'CLICK TO VERIFY EMAIL').get_attribute('href')
print(link_ativador)

More unsuccessful attempts:

link_try1 = WebDriverWait(navegador, 20).until(EC.presence_of_element_located((By.XPATH, '//a[text()="CLICK TO VERIFY EMAIL"]'))).get_attribute('href')
print(link_try1)

link_try2 = (WebDriverWait(navegador, 20).until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, "CLICK TO VERIFY EMAIL"))).get_attribute("href"))
print(link_try2)

link_try3 = (WebDriverWait(navegador, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(., 'CLICK TO VERIFY EMAIL')]"))).get_attribute("value"))
print(link_try3)

The full code:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Chrome e Proxy Tor
servico = Service(ChromeDriverManager().install())
proxy = "socks5://127.0.0.1:9150"  # Tor
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument(f"--proxy-server={proxy}")
navegador = webdriver.Chrome(service=servico, options=chrome_options)

navegador.get('https://tmail.link/inbox/xxxxx.xxxxxx@tmail.link/')
Dreker

2 Answers

1

You should be able to get that href with the following locator:

link_url = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.XPATH, '//a[text()="CLICK TO VERIFY EMAIL"]'))).get_attribute('href')

You will also need to import:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Selenium docs: https://www.selenium.dev/documentation/
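
One possible reason an exact `text()="..."` match fails even when the element is reachable: in the HTML from the question, the link text sits on its own line, so the anchor's text node carries leading and trailing whitespace. The sketch below is plain Python with no browser involved. It assumes the text node looks like the snippet in the question, and `normalize_space` is a hypothetical helper that roughly mirrors XPath's `normalize-space()`.

```python
# Text node of the <a> element as it appears in the question's HTML
# (assumption: the surrounding line breaks and indentation are preserved).
raw_text = "\n                                    CLICK TO VERIFY EMAIL\n                                "
wanted = "CLICK TO VERIFY EMAIL"


def normalize_space(text):
    """Trim the ends and collapse whitespace runs, similar to XPath normalize-space()."""
    return " ".join(text.split())


# An exact comparison, which is what text()="..." performs, does not match.
exact_match = raw_text == wanted

# Comparing the normalized text does match.
normalized_match = normalize_space(raw_text) == wanted

# An XPath that applies the same normalization inside the browser.
NORMALIZED_XPATH = '//a[normalize-space(.)="CLICK TO VERIFY EMAIL"]'

print(exact_match, normalized_match)
print(NORMALIZED_XPATH)
```

`NORMALIZED_XPATH` can be passed as `(By.XPATH, NORMALIZED_XPATH)` in place of the `text()` locator above.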

EDIT: Now that OP has confirmed the actual URL, we can see that the link is inside an iframe, so the following code will retrieve it:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")

webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
actions = ActionChains(browser)

url = 'https://tmail.link/inbox/ambers.rushing@tmail.link/b2b20ce5eeb40a74a8c4ce7d4438883fa44a69c1/'
browser.get(url) 

WebDriverWait(browser, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[@src='1']")))
print('switched to iframe')
specific_url = WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, "CLICK TO VERIFY EMAIL"))).get_attribute("href")
print(specific_url)

This will print in terminal:

switched to iframe
https://u.pcloud.com/track?url=aHR0cHM6Ly91LnBjbG91ZC5jb20vPyNwYWdlPXZlcmlmeW1haWwmY29kZT1UTE9TN1pUWXN6MmxPSWh2bWhZUU9jd2RxSHk3bUJoejk3&token=j7yZTLOS7Z7Z87Zh354dzp5jNRuIf7aJshX1XzSehQX

The setup in the code above is Chrome/chromedriver on Linux, but you can adapt it to your own setup. Just keep the imports and the code that follows the browser/driver definition.
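
Once the driver has switched into an iframe, every later lookup runs inside that frame until the driver switches back. A minimal sketch of one way to scope that, assuming only the `driver.switch_to.frame(...)` and `driver.switch_to.default_content()` methods of Selenium WebDriver; `inside_frame` is a hypothetical helper name, not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then switch back.

    frame_reference is whatever driver.switch_to.frame accepts:
    a frame WebElement, a frame name/id, or an index.
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return to the top-level document even if the lookup raised.
        driver.switch_to.default_content()
```

With a real driver this could be used as `with inside_frame(browser, frame_element): ...`, placing the `find_element(...).get_attribute("href")` call inside the block.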

Barry the Platipus
  • Traceback (most recent call last): File "C:\Users\Dreker\PycharmProjects\pCloud-Invite-Selenium\teste.py", line 18, in link_try1 = WebDriverWait(navegador, 20).until(EC.presence_of_element_located((By.XPATH, '//a[text()="CLICK TO VERIFY EMAIL"]'))).get_attribute('href') File "C:\Users\Dreker\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: – Dreker Aug 16 '22 at 00:02
  • can you confirm the url of the page you are scraping/interacting with? – Barry the Platipus Aug 16 '22 at 00:03
  • @Dreker check the updated response. it will work. – Barry the Platipus Aug 16 '22 at 11:35
  • https://tmail.link/inbox/ambers.2rushing@tmail.link/3d900707995d658df2c451771fa0e0f9048928e6/ – Dreker Aug 16 '22 at 11:40
  • Do you know if I can use OR? The email sometimes comes in a language other than en-us. (By.PARTIAL_LINK_TEXT, "CLIQUE PARA VERIFICAR O SEU E-MAIL" or "CLICK TO VERIFY EMAIL"))).get_attribute("href") – Dreker Aug 16 '22 at 12:24
  • yes you can, check this question: https://stackoverflow.com/questions/54498277/selenium-use-multiple-strings-in-find-element-by-partial-link-text Syntax is a bit deprecated there, but you can see the logic of it. – Barry the Platipus Aug 16 '22 at 12:32
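
A note on the `or` question above: in Python, `"CLIQUE PARA VERIFICAR O SEU E-MAIL" or "CLICK TO VERIFY EMAIL"` evaluates to the first non-empty string, so that locator only ever searches for the Portuguese text. One way to match either language is a single XPath with an `or` condition. The sketch below builds such an expression; `xpath_literal` and `any_link_text_xpath` are hypothetical helper names, not part of Selenium.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal (XPath 1.0 has no escape character)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Text contains both quote types: split on single quotes and glue with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def any_link_text_xpath(texts):
    """Build an XPath matching any <a> whose text contains one of the given strings."""
    if not texts:
        raise ValueError("at least one link text is required")
    conditions = " or ".join(
        f"contains(normalize-space(.), {xpath_literal(t)})" for t in texts
    )
    return f"//a[{conditions}]"


LINK_TEXTS = ["CLIQUE PARA VERIFICAR O SEU E-MAIL", "CLICK TO VERIFY EMAIL"]
print(any_link_text_xpath(LINK_TEXTS))
```

The resulting string can be passed as `(By.XPATH, any_link_text_xpath(LINK_TEXTS))` to `visibility_of_element_located`, after switching into the iframe.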
0

The element is dynamic, so to extract the value of its href attribute you need to induce WebDriverWait for visibility_of_element_located(). You can use either of the following locator strategies:

  • Using PARTIAL_LINK_TEXT:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, "CLICK TO VERIFY EMAIL"))).get_attribute("href"))
    
  • Using XPATH:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(., 'CLICK TO VERIFY EMAIL')]"))).get_attribute("href"))
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in Python Selenium - get href value

undetected Selenium
  • Traceback (most recent call last): File "C:\Users\Dreker\PycharmProjects\pCloud-Invite-Selenium\teste.py", line 21, in link_try2 = (WebDriverWait(navegador, 20).until(EC.visibility_of_element_located((By.PARTIAL_LINK_TEXT, "CLICK TO VERIFY EMAIL"))).get_attribute("href")) File "C:\Users\Dreker\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\support\wait.py", line 89, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Mess – Dreker Aug 16 '22 at 00:04
  • Check with xpath please – undetected Selenium Aug 16 '22 at 07:19