119

I'm trying some stuff out with selenium, and I really want my script to run quickly.

I thought that running my script with headless Chrome would make it faster.

First, is that assumption correct, or does it not matter if I run my script with a headless driver?

I want headless Chrome to work, but somehow it isn't working correctly. I tried different things, and most suggested that it would work as said here in the October update:

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?

But when I tried that, I saw weird console output, and it still doesn't seem to work.

Any tips appreciated.

Michael Mintz
Rhynden

11 Answers

206

To run chrome-headless just add --headless via chrome_options.add_argument, e.g.:

from selenium import webdriver 
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
#chrome_options.add_argument("--disable-extensions")
#chrome_options.add_argument("--disable-gpu")
#chrome_options.add_argument("--no-sandbox") # linux only
chrome_options.add_argument("--headless")
# chrome_options.headless = True # also worked in older Selenium releases; deprecated in newer Selenium 4 releases
driver = webdriver.Chrome(options=chrome_options)
start_url = "https://duckgo.com"
driver.get(start_url)
print(driver.page_source.encode("utf-8"))
# b'<!DOCTYPE html><html xmlns="http://www....
driver.quit()

> So my thought is that running it with headless chrome would make my script faster.

Try using Chrome options like --disable-extensions or --disable-gpu and benchmark it, but I wouldn't count on a substantial improvement.
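To benchmark this on your own machine, a small timing harness is enough. The following is only a sketch and not part of the original answer: `time_runs` and `load_page` are hypothetical helper names, `https://example.com` is a placeholder URL, and the Selenium import is deferred so the timing helper also works on its own.

```python
import time


def time_runs(task, repeats=3):
    """Call task() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def load_page(headless, url="https://example.com"):
    """Start Chrome, load one page, and quit (needs Selenium and Chrome installed)."""
    # Imported here so time_runs() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    if headless:
        options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
    finally:
        # Always close the browser, even if the page load fails.
        driver.quit()
```

Comparing `time_runs(lambda: load_page(True))` with `time_runs(lambda: load_page(False))` over a few repeats gives a rough idea of whether headless mode matters for your pages. Browser startup usually dominates a single page load, so treat small differences with caution.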


References: headless-chrome

Pedro Lobito
  • @AndroidNoobie the edit suggested by ukashima huksay covers a change that was implemented, if I recall correctly, in May 2018. It is only now finding its way in, as a way of getting rep. ukashima huksay should have mentioned it though. (From Review). – ZF007 Aug 18 '19 at 09:26
  • @ukashima huksay next time you find this Chrome change, mention it in a comment next to the change, as I did a few weeks ago on another question. See also my previous comment above this one. (From Review). – ZF007 Aug 18 '19 at 09:27
22

Install & run containerized Chrome:

docker pull selenium/standalone-chrome
docker run --rm -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome

Connect using webdriver.Remote:

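from selenium import webdriver

driver = webdriver.Remote('http://localhost:4444/wd/hub', webdriver.DesiredCapabilities.CHROME)
# Newer Selenium 4 releases dropped the capabilities argument; there, pass options=webdriver.ChromeOptions() instead.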
driver.set_window_size(1280, 1024)
driver.get('https://www.google.com')
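Right after `docker run`, the container may need a few seconds before it accepts sessions. The sketch below polls for readiness before connecting. It assumes the container serves Selenium Grid's `/status` endpoint with a JSON body containing `value.ready` (check this against your image version); `grid_is_ready` and `wait_for_grid` are hypothetical helper names, not part of Selenium.

```python
import http.client
import json
import time
import urllib.request


def grid_is_ready(status):
    """Return True if a parsed /status payload reports the grid as ready."""
    return bool(status.get("value", {}).get("ready", False))


def wait_for_grid(base_url="http://localhost:4444", timeout=30.0, interval=0.5):
    """Poll the status endpoint until the grid is ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/status", timeout=5) as response:
                if grid_is_ready(json.load(response)):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            # Container not accepting connections yet, or the body was not valid JSON.
            pass
        time.sleep(interval)
    return False
```

Calling `wait_for_grid()` before `webdriver.Remote(...)` avoids connection errors in scripts that start the container and the test run back to back.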
joseantgv
Max Malysh
  • `from selenium import webdriver` and `driver = webdriver.Remote('http://localhost:4444/wd/hub', webdriver.DesiredCapabilities.CHROME)` – Donn Lee Apr 02 '21 at 19:09
  • What are the advantages of this over --headless? – Greg Woods Apr 20 '21 at 23:20
  • Where do you get `DesiredCapabilities` from? I don't see the import... I think you meant to use `webdriver.DesiredCapabilities`? – Cornelius Roemer Nov 30 '21 at 20:11
  • @GregWoods - This is a great solution. Many websites (including websites that use 'Cloudflare DNS' to detect robots and crawlers) will see the `--headless` flag in Chrome and will prevent you from browsing the website with your software. By using a Docker container, you avoid the `--headless` flag that can cause you to be blocked. – jward01 Jun 23 '22 at 22:00
  • @GregWoods Great solution, too, when you want to run Selenium tests on a desktop-less server. Before switching to this solution, we had to install a desktop on one of our servers only because otherwise the Chrome package would not even install. Now we can run Selenium tests on any server that has a Docker environment. – not2savvy Jul 01 '22 at 14:33
9
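from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")

# Selenium 4 takes the driver path via Service; executable_path was removed.
driver = webdriver.Chrome(service=Service("./chromedriver"), options=chrome_options)
url = "https://stackoverflow.com/questions/53657215/running-selenium-with-headless-chrome-webdriver"
driver.get(url)

sleep(5)  # crude fixed wait for the page to finish rendering

# find_element_by_xpath was removed in Selenium 4; use find_element(By.XPATH, ...).
h1 = driver.find_element(By.XPATH, "//h1[@itemprop='name']").text
print(h1)
driver.quit()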

Then I ran the script on my local machine:

➜ python script.py
Running Selenium with Headless Chrome Webdriver

It is working and it is with headless Chrome.

Serhii
6

If you are using a Linux environment, you may also have to add --no-sandbox and set a specific window size. The --no-sandbox flag is not needed on Windows if you set up the user container properly.

Use --disable-gpu only on Windows. Other platforms no longer require it. The --disable-gpu flag is a temporary work around for a few bugs.

// Configure and start a headless Chrome browser
WebDriverManager.chromedriver().setup();
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.addArguments("--no-sandbox");
chromeOptions.addArguments("--headless");
chromeOptions.addArguments("--disable-gpu");
// chromeOptions.addArguments("window-size=1400,2100"); // enable on Linux
driver = new ChromeDriver(chromeOptions);
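For Python users, the per-platform advice above could be sketched roughly as follows. `platform_flags` is a hypothetical helper, and the window size is simply the value from the commented-out line in the Java snippet; neither is a Selenium API.

```python
import platform


def platform_flags(system=None):
    """Return the Chrome flags this answer suggests for a given OS name."""
    system = system or platform.system()
    flags = ["--headless"]
    if system == "Linux":
        # Suggested above for Linux, together with an explicit window size.
        flags += ["--no-sandbox", "--window-size=1400,2100"]
    elif system == "Windows":
        # Described above as a temporary workaround needed only on Windows.
        flags.append("--disable-gpu")
    return flags
```

Each returned flag can then be passed to `options.add_argument(...)` on a Selenium `Options` object.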
Yuri
Devdun
4

Chrome's headless mode was updated recently. The --headless flag now takes a value and can be used as follows:

  • For Chrome version 109 and above, the --headless=new flag gives access to the full functionality of the Chrome browser in headless mode.
  • For Chrome versions 96 through 108, the --headless=chrome option provides the same headless Chrome browser.

So, let's add

options.add_argument("--headless=new")

for newer versions of Chrome in headless mode, as mentioned above.
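If your script has to cope with several Chrome versions, the version rule above can be put into a small function. This is only a sketch: `headless_argument` is a hypothetical helper, and falling back to plain `--headless` below version 96 is my assumption, since the list above does not cover those versions.

```python
def headless_argument(chrome_major_version):
    """Pick the headless flag for a Chrome major version, following the list above."""
    if chrome_major_version >= 109:
        return "--headless=new"
    if chrome_major_version >= 96:
        return "--headless=chrome"
    # Assumption: versions before 96 only know the original flag.
    return "--headless"
```

For example, `options.add_argument(headless_argument(120))` adds `--headless=new`.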

S.Mandal
3

Once you have Selenium and the web driver installed, the following worked for me with headless Chrome on a Linux cluster:

from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--disable-extensions")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
options.add_experimental_option("prefs",{"download.default_directory":"/databricks/driver"})
driver = webdriver.Chrome(options=options)  # chrome_options= was removed in newer Selenium 4 releases
Nikunj Kakadiya
2

Steps (tested on a headless Debian Linux 9.4 server):

  1. Install Chrome and ChromeDriver:

    # install chrome
    curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
    echo "deb [arch=amd64]  http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
    apt-get -y update
    apt-get -y install google-chrome-stable
    
    # install chrome driver
    wget https://chromedriver.storage.googleapis.com/77.0.3865.40/chromedriver_linux64.zip
    unzip chromedriver_linux64.zip
    mv chromedriver /usr/bin/chromedriver
    chown root:root /usr/bin/chromedriver
    chmod +x /usr/bin/chromedriver
    
  2. Install selenium:

    pip install selenium
    

    and run this Python code:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    options = Options()
    options.add_argument("no-sandbox")
    options.add_argument("headless")
    options.add_argument("start-maximized")
    options.add_argument("window-size=1900,1080")
    # Selenium 4: pass options= and a Service instead of chrome_options= and executable_path=.
    driver = webdriver.Chrome(options=options, service=Service("/usr/bin/chromedriver"))
    driver.get("https://www.example.com")
    html = driver.page_source
    print(html)
    
Basj
2
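from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
chrome_options = Options()
chrome_options.add_argument("--headless")
# Selenium 4 takes the driver path via Service rather than executable_path.
service = Service(r"C:\Program Files\Google\Chrome\Application\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=chrome_options)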

This is ok for me.

syildizeli
2

As stated by the accepted answer:

options.add_argument("--headless")

These tips might help to speed things up, especially in headless mode.

There are quite a few things you can do in headless mode that you can't do in headed mode.

Since you will be using headless Chrome, I've found that adding the following flag reduces CPU usage by about 20% for me (I found it to be a CPU and memory hog when looking at htop):

--disable-crash-reporter

This flag only disables the crash reporter when you are running headless. It might speed things up for you.

My settings are currently as follows. They reduce CPU usage by about 20%, but give only a marginal time saving:

options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-renderer-backgrounding")
options.add_argument("--disable-background-timer-throttling")
options.add_argument("--disable-backgrounding-occluded-windows")
options.add_argument("--disable-client-side-phishing-detection")
options.add_argument("--disable-crash-reporter")
options.add_argument("--disable-oopr-debug-crash-dump")
options.add_argument("--no-crash-upload")
options.add_argument("--disable-gpu")
options.add_argument("--disable-extensions")
options.add_argument("--disable-low-res-tiling")
options.add_argument("--log-level=3")
options.add_argument("--silent")
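With a list this long it can be easier to keep the flags in one place and apply them in a loop. A minimal sketch, assuming only that the options object has an `add_argument` method (as Selenium's `Options` does); `apply_flags` is a hypothetical helper name:

```python
def apply_flags(options, flags):
    """Add each flag to `options` once, preserving order; return the flags added."""
    seen = set()
    added = []
    for flag in flags:
        if flag not in seen:
            seen.add(flag)
            options.add_argument(flag)
            added.append(flag)
    return added
```

Keeping the flags in a plain list also makes it easy to comment individual ones out while measuring which of them actually affect CPU usage on your machine.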

I found this to be a pretty good list (full list I think) of command line switches with explanations: https://peter.sh/experiments/chromium-command-line-switches/

Some additional things you can turn off are also mentioned here: https://github.com/GoogleChrome/chrome-launcher/blob/main/docs/chrome-flags-for-tools.md

I hope this helps someone

0

You can run Selenium in headless mode as shown below. --headless=new is preferable because plain --headless uses the old headless mode, according to "Headless is Going Away!":

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless=new") # Here
driver = webdriver.Chrome(options=options)

# Equivalent, with Options imported directly:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new") # Here
driver = webdriver.Chrome(options=options)
Super Kai - Kazuya Ito
0

There are different ways of running Chrome in headless environments. (You'll find more details in this answer: https://stackoverflow.com/a/73840130/7058266)

One, the standard headless mode: (Faster than headed mode, but you may experience compatibility issues.)

options.add_argument("--headless")

Then there's the new Chrome headless mode as of Chrome 109: (It runs at the same speed as headed mode, as the two are virtually identical.)

options.add_argument("--headless=new")

(Between Chrome 96 and 108, that new mode used to be --headless=chrome, but it was renamed.)

You can also run regular Chrome in a headless environment if using a headless display, such as Xvfb with a Python program that controls it, such as pyvirtualdisplay. (See https://stackoverflow.com/a/6300672/7058266 and https://stackoverflow.com/a/23447450/7058266)

from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(800, 600))
display.start()

driver = webdriver.Chrome()
driver.get('http://www.google.com')
driver.quit()

display.stop()
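One caveat with the snippet above: if the browser code raises, `display.stop()` is never reached and the Xvfb process can be left running. A small wrapper with `try`/`finally` avoids that. This is a sketch; `run_in_virtual_display` is a hypothetical helper and only assumes the display object has `start()` and `stop()` methods, as pyvirtualdisplay's `Display` does.

```python
def run_in_virtual_display(task, display):
    """Start `display`, run task(), and always stop the display afterwards."""
    display.start()
    try:
        return task()
    finally:
        # Runs even if task() raises, so the virtual display is not left behind.
        display.stop()
```

With pyvirtualdisplay installed, this could be called as `run_in_virtual_display(my_browser_task, Display(visible=0, size=(800, 600)))`, where `my_browser_task` is your own function that creates the driver, does its work, and quits.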

For more compatibility, you can try combining the above together with new Chrome headless mode:

options.add_argument("--headless=new")
Michael Mintz