
I have been using Selenium and Python to web scrape for a couple of weeks now. It has been working fairly well, running on both macOS and Windows 7. However, all of a sudden the headless web driver has stopped working. I have been using chromedriver with the following settings:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
options.add_argument("--window-size=1920,1080")
driver = webdriver.Chrome(options=options)

driver.get('url')

Initially I had to add the window, gpu and sandbox arguments to get it to work, and it did work up until now. However, when I run the script now it gets stuck at driver.get('url'). It doesn't produce an error; it just seems to run indefinitely. When I run without headless and simply run:
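One way to turn a silent hang like this into something diagnosable is to set a page-load timeout before calling driver.get. The sketch below is only an illustration: try_load is a hypothetical helper name, the 30-second default is an arbitrary assumption, and the ImportError fallback exists solely so the helper can be defined on a machine without Selenium installed. set_page_load_timeout and TimeoutException are part of Selenium's Python bindings.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Stand-in so this sketch can be defined where Selenium is not installed.
    class TimeoutException(Exception):
        pass


def try_load(driver, url, timeout_s=30):
    """Return True if the page loads within timeout_s seconds, else False.

    `driver` is expected to be a Selenium WebDriver (hypothetical helper).
    """
    # Ask the driver to give up on page loads that take longer than timeout_s.
    driver.set_page_load_timeout(timeout_s)
    try:
        driver.get(url)
    except TimeoutException:
        # The load did not finish in time; report it instead of hanging.
        return False
    return True
```

With a real driver, try_load(driver, 'url') should return False instead of blocking, assuming the driver honours the timeout. If chromedriver itself is unresponsive, the call may still hang.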

from selenium import webdriver
driver = webdriver.Chrome()

driver.get('url')

it works exactly as intended. This problem is also isolated to my Windows machine. Where do I start?

Frankster
  • Did you change something in Chrome (updates, configuration, etc.) or any other parameters in the meantime? The last example disables not only headless mode but also the sandbox and gpu arguments. Did you try different combinations of these parameters? – Nomce Mar 21 '19 at 08:24
  • The only thing that comes to mind is that I deleted several GB of folders that chromedriver had created in the tmp folder on Windows. Thousands of "scoped_dir" folders had been created and were eating up my storage. Could that be related? Otherwise, I did try combinations of the arguments without any luck. – Frankster Mar 21 '19 at 08:30
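To check whether "scoped_dir" folders are piling up again, a small standard-library script can count them and total their size. This is a rough sketch under assumptions: summarize_scoped_dirs is a hypothetical helper name, and it assumes the folders sit directly in the directory returned by tempfile.gettempdir() with names starting with "scoped_dir", as described in the comment above.

```python
import tempfile
from pathlib import Path


def summarize_scoped_dirs(tmp_root=None):
    """Return (folder_count, total_bytes) for scoped_dir* folders in tmp_root."""
    root = Path(tmp_root or tempfile.gettempdir())
    count = 0
    total_bytes = 0
    for folder in root.glob("scoped_dir*"):
        if not folder.is_dir():
            continue
        count += 1
        # Add up the size of every regular file below this folder.
        for item in folder.rglob("*"):
            try:
                if item.is_file():
                    total_bytes += item.stat().st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return count, total_bytes


if __name__ == "__main__":
    n, size = summarize_scoped_dirs()
    print(f"{n} scoped_dir folders, {size / 1e6:.1f} MB")
```

The script only reports; it deletes nothing.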

2 Answers


Solved

For some reason the proxy setting was slowing it down. Adding the following argument solved it:

options.add_argument(f'--proxy-server={None}')
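As a small aside on what that line actually passes to Chrome: the f-string formats None as text, so the resulting flag is --proxy-server=None. The answer does not say how Chrome interprets that value. The second half of this sketch is an assumption about what might help with diagnosis: it prints the proxy settings that Python's standard library detects on the machine, which may or may not match what Chrome picks up.

```python
import urllib.request

# The f-string formats None with str(), so the flag contains the text "None".
flag = f"--proxy-server={None}"
print(flag)  # --proxy-server=None

# Proxy settings that Python's standard library detects for this machine
# (environment variables, plus the registry on Windows). Empty dict if none.
system_proxies = urllib.request.getproxies()
print(system_proxies)
```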
Frankster

I had exactly the same problem. It appeared randomly after the script had run fine for weeks. The OP pointed me in the right direction, but his solution did not work for me. I had to add:

chrome_options.add_argument("--no-proxy-server")
chrome_options.add_argument("--proxy-server='direct://'")
chrome_options.add_argument("--proxy-bypass-list=*")

My complete code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--start-maximized")
chrome_options.add_argument("--start-fullscreen")
# Bypass any proxy configuration so Chrome connects directly
chrome_options.add_argument("--no-proxy-server")
chrome_options.add_argument("--proxy-server='direct://'")
chrome_options.add_argument("--proxy-bypass-list=*")
# Raw string so the backslashes in the Windows path are taken literally
chrome_options.binary_location = r"C:\Program Files (x86)\Google\Chrome Dev\Application\chrome.exe"

browser = webdriver.Chrome(options=chrome_options)
browser.set_window_size(2000, 1080)
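To confirm that the proxy flags are what makes the difference, it can help to build the argument list from a single switch and run the script once with it on and once with it off. The following is a sketch, not part of the original answer: build_args and make_driver are hypothetical helper names, and the flag strings are copied verbatim from the code above. Selenium is imported inside make_driver so the argument builder can be used on its own.

```python
def build_args(headless=True, bypass_proxy=True):
    """Return the Chrome command-line arguments for one test configuration."""
    args = []
    if headless:
        args.append("--headless")
    if bypass_proxy:
        # Flags taken verbatim from the answer above.
        args += [
            "--no-proxy-server",
            "--proxy-server='direct://'",
            "--proxy-bypass-list=*",
        ]
    return args


def make_driver(headless=True, bypass_proxy=True):
    """Start Chrome with the chosen arguments (requires Selenium and chromedriver)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in build_args(headless, bypass_proxy):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Comparing make_driver(bypass_proxy=True) against make_driver(bypass_proxy=False) on the same URL should show whether the proxy flags are what resolves the hang on a given machine.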

See also:

"Headless chrome driver too slow" and "Chrome webdriver produces timeout in selenium"

Vidarrus