I'm trying to use Selenium to parse a local HTML file called output.html.

In the Python interpreter, I can do my imports, create a webdriver.Chrome driver object and GET my local file just fine.

I get an error as soon as I try to read or find anything on the page using the driver's functions.

Code:

>>> from selenium import webdriver
>>> from selenium.webdriver.chrome.options import Options
>>> 
>>> chrome_options = Options()
>>> chrome_options.binary_location = '/usr/bin/google-chrome'
>>> chrome_options.add_argument('--headless')
>>> chrome_options.add_argument('--no-sandbox')
>>> chrome_options.add_argument('--disable-dev-shm-usage')
>>> 
>>> driver = webdriver.Chrome(chrome_options=chrome_options)
>>> 
>>> driver.get('file:output.html')
>>> 
>>> # no error up to here
>>> 
>>> driver.name  # runs ok
>>> driver.orientation  # runs ok
>>>
>>> driver.page_source  # error!
>>> driver.find_element_by_name('p_system')  # error!

I am baffled as to the cause of the error. Every page I find on Google suggests that the chromedriver and/or Google Chrome binary is in the wrong place or can't be found by Selenium. That can't be the case here: I can use GET with the driver successfully on the local HTML file, and the same code runs fine on websites like https://www.python.org.

Error Traceback:

selenium.common.exceptions.WebDriverException: Message: chrome not reachable
  (Session info: headless chrome=74.0.3729.169)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Linux 4.4.0-17763-Microsoft x86_64

Duplicates:

While it's easy to mark questions as duplicates and move on, it's much better to review the questions to at least check if there are differences between them.

The key difference from the other Stack Overflow questions is that my setup works for external websites but not for local files. In the other questions it doesn't work at all, and changing versions fixes the issue.

As shown in the error traceback, the chromedriver version and the headless chrome version are both 74 and should be compatible according to this site.

The Selenium webdriver will work as intended up until you call a certain function, then it will throw the error.

ChumiestBucket

1 Answer

Try using the complete path to the file, as shown in the example below.

url = r"file:///C:/Users/xxxx/Desktop/delte.html"
driver.get(url)
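
If typing the full URL by hand is error-prone, one option is to let pathlib build it from the file's location. This is only a sketch: `local_file_url` is a hypothetical helper name, `output.html` is the file from the question, and I have not verified that an absolute URL avoids the "chrome not reachable" error in this setup.

```python
from pathlib import Path


def local_file_url(path):
    """Return an absolute file:// URL for a local path (hypothetical helper)."""
    # absolute() anchors a relative path at the current working directory;
    # as_uri() only accepts absolute paths and percent-encodes special characters.
    return Path(path).absolute().as_uri()


url = local_file_url("output.html")
print(url)

# With the driver created as in the question, the call would then be:
# driver.get(url)
```

The printed URL depends on the current working directory, so the interpreter has to be started from (or pointed at) the folder that actually contains the file.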
supputuri
  • Putting the complete path actually makes `driver.get` fail, unlike just putting the filename. I'm using Linux, so the URL string would be `file:/c/Users/user/path/to/the/file.html` – ChumiestBucket Jun 04 '19 at 23:36