
While scraping this page:

https://www.hkex.com.hk/Products/Listed-Derivatives/Equity-Index/Hang-Seng-Index-(HSI)/Hang-Seng-Index-Futures?sc_lang=en#&product=HSI

in Google Chrome DevTools (F12), I see that the element at the XPath

  //*[@id="equity_future"]

has a thead and a tbody. The tbody is populated.

However, inside the Python 3 debugger, with

from bs4 import BeautifulSoup
from selenium import webdriver

wdriver = webdriver.PhantomJS()
wdriver.get(url)
soup = BeautifulSoup(wdriver.page_source, "lxml")

I do see the thead children, but the tbody appears empty:

<tbody>
</tbody>

Any ideas?

MMM

1 Answer


Using only Selenium, you can print the page_source and find all the <tbody> tags in it, as follows:

  • Code Block:

    from selenium import webdriver

    # Point executable_path at the local PhantomJS binary
    driver = webdriver.PhantomJS(executable_path=r'C:\WebDrivers\phantomjs.exe')
    driver.get("https://www.hkex.com.hk/Products/Listed-Derivatives/Equity-Index/Hang-Seng-Index-(HSI)/Hang-Seng-Index-Futures?sc_lang=en#&product=HSI")
    print(driver.page_source)
    
  • Console Output Snippet 1:

    <tbody>
    <tr>
        <td class="ls">Last Traded</td>
        <td class="vo">Volume</td>
        <td class="oi">Prev.Day Open Interest</td>
    </tr>
    </tbody>
    
  • Console Output Snippet 2:

    <tbody>
    <tr>
        <td class="se">Prev.Day Settlement Price</td>
        <td class="vo">Volume</td>
        <td class="oi">Prev.Day Open Interest</td>
    </tr>
    </tbody>
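
If the <tbody> you are after still comes back empty, one possible explanation is that its rows are added by JavaScript after the initial load, so page_source gets read too early. That is an assumption on my part; nothing in the output above confirms it. Below is a minimal sketch of polling the source until some <tbody> contains rows, using only the standard library. count_tbody_rows and wait_for_rows are hypothetical helper names, not Selenium APIs, and the check is deliberately crude: it counts rows in any <tbody>, not specifically the one under equity_future.

```python
import time
from html.parser import HTMLParser


class TbodyRowCounter(HTMLParser):
    """Counts <tr> elements that appear inside any <tbody>."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # how many <tbody> elements we are currently inside
        self.rows = 0   # number of <tr> start tags seen inside a <tbody>

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.depth += 1
        elif tag == "tr" and self.depth > 0:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody" and self.depth > 0:
            self.depth -= 1


def count_tbody_rows(html):
    """Return the number of <tr> elements found inside <tbody> elements."""
    parser = TbodyRowCounter()
    parser.feed(html)
    return parser.rows


def wait_for_rows(get_source, timeout=10.0, poll=0.5):
    """Call get_source() repeatedly until the HTML has tbody rows.

    Returns the last HTML fetched, whether or not rows appeared
    before the timeout (in seconds) ran out.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = get_source()
        if count_tbody_rows(html) > 0 or time.monotonic() >= deadline:
            return html
        time.sleep(poll)
```

With the driver from the question this could be called as html = wait_for_rows(lambda: wdriver.page_source), and the result passed to BeautifulSoup as before. Selenium's own WebDriverWait is the more usual tool for this kind of wait; the hand-rolled loop here only keeps the sketch free of extra dependencies.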
    
undetected Selenium