I would like to scrape data from this page: https://www.investing.com/equities/nvidia-corp-financial-summary.

There are two buttons that I'd like to click:

  1. The Accept button.

The XPath of the button is //*[@id="onetrust-accept-btn-handler"]

Following the steps from "Clicking a button with selenium using Xpath doesn't work", I tried:

from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
    
wait = WebDriverWait(driver, 5)
link= wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id="onetrust-accept-btn-handler")))

I got the error: SyntaxError: invalid syntax

  2. The Annual button. There is a toggle between Annual and Quarterly (the default is Quarterly).

Its XPath is //*[@id="leftColumn"]/div[9]/a[1]

wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id="leftColumn"]/div[9]/a[1]")))

This also raised SyntaxError: invalid syntax.


Updated Code

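import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait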

company = 'nvidia-corp'
driver = webdriver.Chrome(path)
driver.get(f"https://www.investing.com/equities/{company}-financial-summary")

wait = WebDriverWait(driver, 2)
accept_link= wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="onetrust-accept-btn-handler"]')))
accept_link.click()

scrollDown = "window.scrollBy(0,500);"
driver.execute_script(scrollDown)
#scroll down to get the page data below the first scroll

driver.maximize_window()
time.sleep(10)

wait = WebDriverWait(driver, 2)

scrollDown = "window.scrollBy(0,4000);"
driver.execute_script(scrollDown)
#scroll down to get the page data below the first scroll

try:
    close_popup_link= wait.until(EC.element_to_be_clickable((By.XPATH,'//*[@id="PromoteSignUpPopUp"]/div[2]/i')))
    close_popup_link.click()
except NoSuchElementException:
    print('No such element')
    
wait = WebDriverWait(driver, 3)
try:
    annual_link = wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="leftColumn"]/div[9]/a[1]')))
    annual_link.click()
    # break
except NoSuchElementException:
    print('No element of that id present!')

The Accept button is clicked successfully, but the wait for the Annual button raises a TimeoutException.


Prophet
Luc
  • What link/element do you mean by `financial_link`? I see nothing there matching this XPath locator. – Prophet Sep 03 '22 at 18:03
  • I have updated the code. The financial_link can be ignored now. The accept privacy button works, but switching the toggle from Quarterly to Annual does not. – Luc Sep 03 '22 at 18:14
  • I still see nothing there matching that locator. Even nothing matching the `//*[@id="leftColumn"]`. Do you mean the `1 Year` button below the chart? – Prophet Sep 03 '22 at 18:24
  • On https://www.investing.com/equities/nvidia-corp-financial-summary, below the long "Financial Summary" paragraph, there is the Annual | Quarterly toggle. The element I mean is the "Annual" button, and its XPath is: //*[@id="leftColumn"]/div[9]/a[1]
    I have added a new image to make it clearer. Do you see it?
    – Luc Sep 03 '22 at 18:31
  • OK, I see it. Will try to do that. – Prophet Sep 03 '22 at 19:17

2 Answers

At least for me, that element could only be reached with a different locator.
I scroll down until the element can be clicked.
The following code works for me:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("--start-maximized")

s = Service(r'C:\webdrivers\chromedriver.exe')  # raw string keeps the backslashes literal

driver = webdriver.Chrome(options=options, service=s)
company = 'nvidia-corp'

wait = WebDriverWait(driver, 5)
driver.get(f"https://www.investing.com/equities/{company}-financial-summary")
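try:
    # Close the sign-up popup if it appears within the wait timeout.
    wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="PromoteSignUpPopUp"]/div[2]/i'))).click()
except Exception:
    pass
while True:
    try:
        # Try to click the Annual toggle.
        wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='float_lang_base_1']//a[@data-ptype='Annual']"))).click()
        break
    except Exception:
        # Not clickable yet: scroll down 1000 pixels and retry.
        driver.execute_script("window.scrollBy(0, arguments[0]);", 1000)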
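
The loop above retries forever if the element never becomes clickable. A bounded variant could look like the sketch below. The name click_with_scroll and its parameters are hypothetical, not part of Selenium; the helper only assumes that driver has execute_script() and that try_click raises an exception when the click fails.

```python
from typing import Callable


def click_with_scroll(driver, try_click: Callable[[], None],
                      max_scrolls: int = 10, step: int = 1000) -> bool:
    """Call try_click(); on failure scroll down by `step` pixels and retry.

    Returns True as soon as try_click() succeeds, or False after
    `max_scrolls` failed attempts.
    """
    for _ in range(max_scrolls):
        try:
            try_click()
            return True
        except Exception:
            # Same scroll call as in the loop above, with a configurable step.
            driver.execute_script("window.scrollBy(0, arguments[0]);", step)
    return False
```

With the objects from the code above, try_click could be a lambda that runs wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@data-ptype='Annual']"))).click(), and a False return value tells you the toggle was never reached.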
Prophet
  • For me, the code gets stuck (it keeps loading with no visible progress) after the accept button. It doesn't get past the "sign up for free and get..." popup, and it does not switch the toggle from Quarterly to Annual. I believe the link should be https://www.investing.com/equities/{company}-financial-summary, with a dash rather than a "/" before "financial-summary" – Luc Sep 03 '22 at 20:31
  • Please see the updated answer. It works – Prophet Sep 03 '22 at 20:42
  • It was some kind of misunderstanding with the URL of the page – Prophet Sep 03 '22 at 20:45
  • after adding the try-except clause for the accept button, it works. Thanks. What is the purpose of the last except clause? – Luc Sep 03 '22 at 20:55
  • `driver.execute_script("window.scrollBy(0, arguments[0]);", 1000)` scrolls down by 1000 pixels. The loop keeps scrolling until the click succeeds – Prophet Sep 03 '22 at 20:57

You need to take care of a couple of things here as follows:

  • If you supply the XPath within double quotes, i.e. "...", then the attribute values need to be within single quotes, i.e. '...'
  • Similarly, if you supply the XPath within single quotes, i.e. '...', then the attribute values need to be within double quotes, i.e. "..."

This takes care of both occurrences of SyntaxError: invalid syntax

Effectively, the lines of code will be:

link = wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='onetrust-accept-btn-handler']")))

and

wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='leftColumn']/div[9]/a[1]")))
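
The quoting rule can be checked without a browser, since it is plain Python string syntax. The sketch below is only an illustration: xpath_by_id is a hypothetical helper, not a Selenium function, and compile() is used solely to show that the original line does not parse.

```python
# Equivalent ways to write an XPath literal that contains quotes.
ELEMENT_ID = "onetrust-accept-btn-handler"

double_outer = "//*[@id='onetrust-accept-btn-handler']"  # single quotes inside
single_outer = '//*[@id="onetrust-accept-btn-handler"]'  # double quotes inside
escaped = "//*[@id=\"onetrust-accept-btn-handler\"]"     # backslash-escaped quotes


def xpath_by_id(element_id: str) -> str:
    """Hypothetical helper: build an id-based XPath with consistent quoting."""
    return f"//*[@id='{element_id}']"


# The line from the question reuses double quotes inside a double-quoted
# string, so Python sees two strings with a bare name between them.
broken_source = 'xpath = "//*[@id="leftColumn"]/div[9]/a[1]"'
try:
    compile(broken_source, "<example>", "exec")
    compiled = True
except SyntaxError:
    compiled = False
```

Here single_outer and escaped are the same string, and xpath_by_id(ELEMENT_ID) produces the double_outer form.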

Solution

To click these elements, wait for each one with WebDriverWait and element_to_be_clickable(). You can use either of the following locator strategies:

  • Clicking on I Accept:

    • Using CSS_SELECTOR:

      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
      
    • Using XPATH:

      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@id='onetrust-accept-btn-handler']"))).click()
      
  • Clicking on Annual:

    • Using CSS_SELECTOR:

      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[data-ptype='Annual']"))).click()
      
    • Using XPATH:

      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@data-ptype='Annual']"))).click()
      
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
undetected Selenium