0

I am trying to scrape the tables from the dynamic webpage below. I am using the code below to find the table rows (they are under the tag name tr), but I get an empty list as output. Is there anything I am missing here?

https://www.taipower.com.tw/tc/page.aspx?mid=206&cid=406&cchk=b6134cc6-838c-4bb9-b77a-0b0094afd49d

from selenium import webdriver
chrome_path = r"C:\Users\upko\Downloads\My Projects\Ibrahim Projects\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://www.taipower.com.tw/tc/page.aspx?mid=206&cid=406&cchk=b6134cc6-838c-4bb9-b77a-0b0094afd49d")
driver.find_elements_by_tag_name('tr')

Please find a screenshot of the webpage's inspected markup below.

user438383

2 Answers

1

The website uses iframes, so you need to switch into the desired iframe before you can access its data. I haven't tested this code, but it should work:

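from selenium.webdriver.common.by import By

# The table lives inside an iframe, so locate it and switch into it first
iframe = driver.find_element(By.XPATH, "//iframe[@id='IframeId']")
driver.switch_to.frame(iframe)

# Now the rows inside the iframe are reachable
trs = driver.find_elements(By.TAG_NAME, 'tr')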
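
After switching into a frame, the driver stays there until it is told to switch back, which can make later lookups on the main page come back empty. A minimal sketch of one way to handle that, assuming a hypothetical helper named inside_frame (not part of Selenium) wrapped around the real switch_to.frame() and switch_to.default_content() calls:

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, by, value):
    """Switch into the frame located by (by, value), then switch back on exit.

    `inside_frame` is a hypothetical helper name, not a Selenium API.
    """
    frame = driver.find_element(by, value)
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Return to the top-level document even if the scraping code raised
        driver.switch_to.default_content()


# Possible usage with a real driver (needs Selenium and a browser):
#
#   from selenium.webdriver.common.by import By
#   with inside_frame(driver, By.XPATH, "//iframe[@id='IframeId']"):
#       trs = driver.find_elements(By.TAG_NAME, "tr")
```

The try/finally is the point of the design: whatever happens while reading the rows, the driver ends up back in the top-level document.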
Wonka
0

The desired elements are within an <iframe> so you have to:

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for visibility_of_all_elements_located() on the desired elements.

  • You can use the following locator strategy:

  • Using XPATH:

    driver.get("https://www.taipower.com.tw/tc/page.aspx?mid=206&cid=406&cchk=b6134cc6-838c-4bb9-b77a-0b0094afd49d")
    WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='IframeId']")))
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='container-fluid']//div[@class='span6']/strong")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    ['核能(Nuclear)', '燃煤(Coal)', '汽電共生(Co-Gen)', '民營電廠-燃煤(IPP-Coal)', '燃氣(LNG)', '民營電廠-燃氣(IPP-LNG)', ' 燃油(Oil)', '輕油(Diesel)', '水力(Hydro)', '風力(Wind)', '太陽能(Solar)', '抽蓄發電(Pumping Gen)']
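
The same two waits can be pointed at the table rows instead of the headings. The sketch below rests on assumptions: the XPath //tbody/tr is a guess at the row markup inside the frame and should be checked in the inspector, and scrape_table and clean_rows are hypothetical names. It waits for presence rather than visibility, because visibility_of_all_elements_located only succeeds once every matched element is displayed; whether that explains any particular timeout on this page is not verified. The stripping step is there because cell text can carry stray whitespace, like the leading space in ' 燃油(Oil)' in the output above.

```python
def clean_rows(raw_rows):
    """Strip whitespace from every cell and drop rows that are entirely empty."""
    cleaned = []
    for cells in raw_rows:
        texts = [cell.strip() for cell in cells]
        if any(texts):
            cleaned.append(texts)
    return cleaned


def scrape_table(driver, timeout=20):
    """Wait for the iframe, then for its rows; return the cell texts per row."""
    # Imported inside the function so clean_rows() works without Selenium installed
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # The frame id comes from the XPath used in the answer above
    wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "IframeId")))
    # ASSUMPTION: data rows match //tbody/tr; adjust after inspecting the frame
    rows = wait.until(
        EC.presence_of_all_elements_located((By.XPATH, "//tbody/tr"))
    )
    raw = [[td.text for td in row.find_elements(By.TAG_NAME, "td")] for row in rows]
    return clean_rows(raw)
```

With a live driver, scrape_table(driver) would be called right after driver.get(...); header rows built from th cells come back empty under this sketch and are dropped by clean_rows.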
    

undetected Selenium
  • Hi, thank you for the solution. I wanted to get the data in the tables inside the "tbody", so I tried changing the XPath (I used row instead of span6 in your code and ended it with /tr instead of /strong), but I am getting a timeout error. – Yaswanth Maram Apr 11 '22 at 09:18