Here is my Python code:

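import pandas as pd
import pandas_datareader.data as web
import bs4 as bs
import urllib.request as ul

# Assumed missing import: style.use() below needs matplotlib's style module.
from matplotlib import style
from selenium import webdriver

style.use('ggplot')

# Headless PhantomJS browser; the path points at the local phantomjs.exe.
driver = webdriver.PhantomJS(executable_path='C:\\Phantomjs\\bin\\phantomjs.exe')


def getBondRate():
    """Load the MarketWatch TNX page and return its rendered HTML."""
    # driver.delete_all_cookies()
    url = "https://www.marketwatch.com/investing/index/tnx?countrycode=xx"

    driver.get(url)
    # Note: implicitly_wait only affects element lookups, not page_source.
    driver.implicitly_wait(10)
    html = driver.page_source
    return html


bondRate = getBondRate()
print(bondRate)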

A few days ago it was reading the page from MarketWatch without any problem. Now it returns nothing inside the body tag. Is Selenium not loading the page?

Martijn Pieters
JAGS8386

2 Answers

Do you need the HTML tags as well? If not, you can retrieve just the text of the body element. Here is how I would do it in Java:

String src=driver.findElement(By.tagName("body")).getText();
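
Since the question uses Python, a rough equivalent of the Java line might look like the sketch below. The helper name get_body_text is hypothetical, and it assumes driver is an already created Selenium WebDriver. The locator string "tag name" is the value of Selenium's By.TAG_NAME constant, written out so the snippet needs no import of its own.

```python
def get_body_text(driver):
    """Return the visible text of the page's <body>, without HTML tags."""
    # "tag name" is the value of selenium.webdriver.common.by.By.TAG_NAME.
    body = driver.find_element("tag name", "body")
    # .text holds only the rendered text; an empty string here suggests
    # that the page content never made it into the body.
    return body.text.strip()
```

Calling get_body_text(driver) after driver.get(url) would return an empty string in the situation described in the question, so this only changes what is retrieved, not whether the page loads.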
SteroidKing666

For the URL https://www.marketwatch.com/investing/index/tnx?countrycode=xx, the behavior you are observing is expected.

I took your code and, with a small tweak, tried to extract the page_source with PhantomJS as well as ChromeDriver. With either WebDriver variant, the WebDriver fingerprint gets detected and a fingerprinting error is raised, as follows:

  • Error details:

    Failed to load resource: the server responded with a status of 404 (Not Found)
    kpf.js?url=/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint&token=058cbc6a-f8b8-f175-ca68-8c2e0fd6a4e3:1 Fingerprinting error 
      name: Error 
      message: Error issuing AJAX request (status code: 404) 
      stack: Error: Error issuing AJAX request (status code: 404)
        at XMLHttpRequest.N.a.onreadystatechange (https://www.marketwatch.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint/script/kpf.js?url=/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint&token=058cbc6a-f8b8-f175-ca68-8c2e0fd6a4e3:1:1884)
    DevTools failed to parse SourceMap: https://www.marketwatch.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint/script/fingerprint.js.map
    
  • DevTools Snapshot:

    [Image: DevTools console snapshot of the fingerprinting error]
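
To check from the Python side whether a returned page_source shows this symptom, a minimal heuristic sketch follows. It uses only the standard library; the names BodyProbe and looks_blocked are hypothetical, and matching the substring "fingerprint" in script URLs is an assumption based on the kpf.js path in the error above, not a reliable detector.

```python
from html.parser import HTMLParser


class BodyProbe(HTMLParser):
    """Collects visible <body> text and the src of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0      # > 0 while inside <script> or <style>
        self.body_text = []
        self.script_srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1
            if tag == "script":
                src = dict(attrs).get("src")
                if src:
                    self.script_srcs.append(src)

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that sits in <body> outside script/style.
        if self.in_body and not self.skip_depth and data.strip():
            self.body_text.append(data.strip())


def looks_blocked(page_source):
    """Return (empty_body, fingerprint_script) for an HTML string."""
    probe = BodyProbe()
    probe.feed(page_source)
    empty_body = not probe.body_text
    # Assumption: the anti-bot script URL contains "fingerprint".
    fingerprint_script = any("fingerprint" in s for s in probe.script_srcs)
    return empty_body, fingerprint_script
```

For example, looks_blocked(driver.page_source) returning an empty body together with a fingerprint script would be consistent with the detection described above, though other causes of an empty body are possible.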

undetected Selenium
  • Thank you. How can I get around this issue if I still need to access the data? – JAGS8386 Aug 09 '18 at 15:23
  • @JAGS8386 There are multiple ways. You can compile the WebDriver binary, i.e. the chromedriver binary, with a few tweaks, or use a proxy. I have updated the answer and added some more references. – undetected Selenium Aug 09 '18 at 15:25
  • @JAGS8386 were you able to overcome the kpf.js with Selenium or any other tool? – matteo84 May 25 '20 at 18:33