I need to use Python to automatically download a .csv file from this web page:

https://pace.coe.int/en/aplist/committees/9/commission-des-questions-politiques-et-de-la-democratie

Now, I have written this code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

chromedriver_path = r"./driver/chromedriver"
browser = webdriver.Chrome(executable_path=chromedriver_path)
url = "https://pace.coe.int/en/aplist/committees/9/commission-des-questions-politiques-et-de-la-democratie"
topics_xpath = '//*[@id="challenge-stage"]/div/label/span[2]'
browser.get(url)
time.sleep(5)  #Wait a little for page to load.
escolhe = browser.find_element("xpath", topics_xpath)
time.sleep(5)
escolhe.click()
time.sleep(5)

The web page opens up and I am then prompted to click on "Verify you are human":

I have "inspected" the button and copied the xpath (see code above). But I get this error:

NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="challenge-stage"]/div/label/span[2]"}
  (Session info: chrome=114.0.5735.198)

Can anyone help me, please?

undetected Selenium
Giampaolo Levorato
  • Did you consider that whoever implemented that humanity check would do their very best to make clicking it automatically impossible, or as hard as possible? And that they might be opposed to you doing that, even if you found a way to do it? – Yunnosch Aug 17 '23 at 12:10

2 Answers

The element associated with the text Verify you are human is inside an <iframe>, which is why the XPath is not found in the top-level document. You have to:

  • Use WebDriverWait to wait until the frame is available, then switch to it.

  • Use WebDriverWait to wait until the target element inside the frame is clickable.

  • You can use either of the following locator strategies:

    • Using CSS_SELECTOR:

      driver.get("https://pace.coe.int/en/aplist/committees/9/commission-des-questions-politiques-et-de-la-democratie")
      time.sleep(5)
      WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[title='Widget containing a Cloudflare security challenge']")))
      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "label.ctp-checkbox-label"))).click()
      
    • Using XPATH:

      driver.get("https://pace.coe.int/en/aplist/committees/9/commission-des-questions-politiques-et-de-la-democratie")
      time.sleep(5)
      WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@title='Widget containing a Cloudflare security challenge']")))
      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//label[@class='ctp-checkbox-label']"))).click()
      
  • Note: You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
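
The explicit waits in the steps above poll a condition until it holds or a timeout expires, which is why they tend to be more reliable than the fixed time.sleep(5) calls in the question. The following is a minimal standard-library sketch of that polling idea only. It is not Selenium's implementation, and the names wait_until and element_ready are hypothetical, chosen for illustration.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Stand-in for "is the element there yet?": succeeds on the third check.
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    return "checkbox" if attempts["count"] >= 3 else None


found = wait_until(element_ready, timeout=5, poll_interval=0.1)
print(found, "after", attempts["count"], "checks")
```

The wait returns as soon as the condition succeeds instead of always sleeping for the full duration, and it fails loudly with a timeout if the condition never holds.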
    

undetected Selenium
  • Curious: Does this actually defeat the non-automation check? If so, it would seem to be a rather weak check. – kjhughes Jun 28 '23 at 17:50
Here's a complete SeleniumBase script for bypassing Cloudflare on that site.

pip install seleniumbase, then run with python:

from seleniumbase import SB

def verify_success(sb):
    sb.assert_element('img[alt="Logo Assembly"]', timeout=8)
    sb.sleep(4)

with SB(uc_cdp=True, guest_mode=True) as sb:
    sb.open("https://pace.coe.int/en/aplist/committees/9/commission-des-questions-politiques-et-de-la-democratie")
    try:
        verify_success(sb)
    except Exception:
        if sb.is_element_visible('input[value*="Verify"]'):
            sb.click('input[value*="Verify"]')
        elif sb.is_element_visible('iframe[title*="challenge"]'):
            sb.switch_to_frame('iframe[title*="challenge"]')
            sb.click("span.mark")
        else:
            raise Exception("Detected!")
        try:
            verify_success(sb)
        except Exception:
            raise Exception("Detected!")

It will only click the checkbox if necessary. Use sb.driver to access the raw driver. The script tries to avoid detection altogether.
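
Once the page has loaded, the raw driver can hand its session over to a plain HTTP client for the actual .csv download. The sketch below uses only the standard library and assumes the list-of-dicts shape that Selenium's driver.get_cookies() returns. The helper names cookie_header and download_file are hypothetical, the cookie values are made up, and the real CSV link is not given in the question, so it has to be supplied by the caller. Cloudflare may still reject a request made outside the browser, so treat this as a starting point rather than a guaranteed method.

```python
import urllib.request


def cookie_header(cookies):
    """Build a Cookie header value from a list of {'name': ..., 'value': ...}
    dicts, the shape returned by Selenium's driver.get_cookies()."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def download_file(url, cookies, user_agent, dest_path):
    """Fetch `url` with the browser's cookies and user agent; save to dest_path."""
    request = urllib.request.Request(
        url,
        headers={"Cookie": cookie_header(cookies), "User-Agent": user_agent},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        with open(dest_path, "wb") as out:
            out.write(response.read())
    return dest_path


# Example cookie list in the get_cookies() shape (values are made up).
example_cookies = [
    {"name": "cf_clearance", "value": "abc123"},
    {"name": "session", "value": "xyz"},
]
print(cookie_header(example_cookies))
```

With the script above, a call might look like download_file(csv_url, sb.driver.get_cookies(), sb.driver.execute_script("return navigator.userAgent"), "members.csv"), where csv_url and the output file name are placeholders.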

Michael Mintz