I am trying to scrape data from this website using Selenium.

The data has three features, "Value", "Net change", and "Percent change", and the net and percent changes are available for 1, 3, 6, and 12 months. I want to fetch the 1-month net change and percent change. To do that, I need to tick the corresponding check boxes and then click the update button.

I performed these actions with Selenium's find-element-by-XPath method, but for the percent change I had to use ActionChains because I was getting an "Element not clickable" error.

When I execute the code, all three features should appear in the downloaded CSV, but that's not happening. I am only able to fetch "Value" and "1 Month Net change". Does anyone know why the percent change is not getting updated, or how to fix it? Thanks

My code:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from bs4 import BeautifulSoup

from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
# chromedriver is expected to be on PATH
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://beta.bls.gov/dataViewer/view/timeseries/CUUR0000SA0")
# page_source is already a str, so no from_encoding is needed
soup = BeautifulSoup(driver.page_source, "html.parser")
driver.find_element(By.XPATH, '/html/body/div[2]/div/div/div[4]/div/div[1]/form/div[2]/fieldset/div[1]/table/tbody/tr[1]/td[1]/label/input').click()  # 1 month net change
element = WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="percent_monthly_changes_div"]/table/tbody/tr[1]/td[1]/label/input')))
ActionChains(driver).move_to_element(element).click().perform()  # 1 month percent change
driver.find_element(By.XPATH, '/html/body/div[2]/div/div/div[4]/div/div[1]/form/div[4]/input').click()  # update button
driver.find_element(By.XPATH, '//*[@id="csvclickCU"]').click()  # download csv button
ezkl
Asmita

2 Answers

The website shows N/A in the 1 Month Net change column. If you are still not getting the 1 month % change value, you can use

driver.execute_script('document.querySelector("#percent_monthly_changes_div > table > tbody > tr:nth-child(1) > td:nth-child(1) > label > input").click()')

instead of:

element = WebDriverWait(driver, 60).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="percent_monthly_changes_div"]/table/tbody/tr[1]/td[1]/label/input')))
ActionChains(driver).move_to_element(element).click().perform()

[Screenshot: column of 1 Month Net change]

This might not be the optimal solution, but it works fine. Note that the 1 month net change value is not provided by the website itself.
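The JavaScript click above can be wrapped in a small helper so that a missing element is reported as a return value instead of raising a JavaScript error inside the page. This is only a sketch: `js_click` is a hypothetical name, not part of Selenium, and it assumes `driver` is an object exposing Selenium's `execute_script(script, *args)` method.

```python
def js_click(driver, css_selector):
    """Click the first element matching css_selector via JavaScript.

    Returns True if an element was found and clicked, False otherwise.
    js_click is a hypothetical helper name, not a Selenium API.
    """
    script = """
        const el = document.querySelector(arguments[0]);
        if (!el) { return false; }
        el.scrollIntoView({block: 'center'});
        el.click();
        return true;
    """
    # Extra positional arguments are exposed to the script as arguments[i].
    return bool(driver.execute_script(script, css_selector))


# Usage with a real driver (selector taken from the snippet above):
# js_click(driver, "#percent_monthly_changes_div > table > tbody > "
#                  "tr:nth-child(1) > td:nth-child(1) > label > input")
```

Because the click is dispatched from inside the page, it is not affected by another element overlapping the checkbox, which is usually what triggers the "Element not clickable" error.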

Kassab
  • PS: when developing, you'd better not use a headless browser, so you can monitor what's happening on the website. – Kassab Jul 04 '22 at 14:44
  • Thank you so much !! This is really helpful. I was stuck in it for the last three days. – Asmita Jul 05 '22 at 19:47
  • You are welcome. I'd suggest switching to the script suggested by @undetectedSelenium in the answer below, since it's more optimal. – Kassab Jul 05 '22 at 20:09
  • I tried it but it actually isn't working for me. The feature was not updating. – Asmita Jul 06 '22 at 05:41

Using ActionChains to click the 1-Month Net Change and 1-Month % Change elements is unnecessary overhead that you can easily avoid.

Ideally, you should use WebDriverWait with element_to_be_clickable(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    driver.get("https://beta.bls.gov/dataViewer/view/timeseries/CUUR0000SA0")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[value='1N']"))).click()
    driver.find_element(By.CSS_SELECTOR, "input[value='1P']").click()
    
  • Using XPATH:

    driver.get("https://beta.bls.gov/dataViewer/view/timeseries/CUUR0000SA0")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@value='1N']"))).click()
    driver.find_element(By.XPATH, "//input[@value='1P']").click()
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
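
Since clicking a checkbox toggles its state, it can help to confirm that each box is actually ticked before pressing the update button. A minimal sketch, assuming `driver` is a Selenium WebDriver; `ensure_checked` is a hypothetical helper name, and the `input[value='1N']` and `input[value='1P']` selectors are the ones from the snippets above:

```python
def ensure_checked(driver, by, value):
    """Tick the checkbox located by (by, value) if it is not ticked yet.

    Returns the checkbox's final selected state.
    ensure_checked is a hypothetical helper name, not a Selenium API.
    """
    box = driver.find_element(by, value)
    if not box.is_selected():
        box.click()
    return box.is_selected()


# Usage with a real driver (By comes from selenium.webdriver.common.by):
# for css in ("input[value='1N']", "input[value='1P']"):
#     if not ensure_checked(driver, By.CSS_SELECTOR, css):
#         print("still unchecked:", css)
```

If the helper still returns False after a normal click, the JavaScript click from the other answer may be worth trying for that checkbox.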
    

Browser Snapshot: [image: beta_bls_gov]

undetected Selenium