Selenium/Chromedriver/Chromium(86) issues AWS Lambda

Question

I've been dealing with this issue for the past week and can't get my head around it, so I decided to ask for help. I'm trying to run Selenium in AWS Lambda using a Chromium 86 build. The error message I keep getting is the following:

{
  "errorMessage": "Message: unknown error: Chrome failed to start: exited abnormally.\n  (chrome not reachable)\n  (The process started from chrome location /opt/bin/chromium is no longer running, so ChromeDriver is assuming that Chrome has crashed.)\n",
  "errorType": "WebDriverException"
}

Here's my build:

Selenium 3.14
Chromium 86.0.4240.0 (https://github.com/vittorio-nardone/selenium-chromium-lambda/blob/master/chromium.zip) which is forked from (https://github.com/puppeteer/puppeteer)
Chromedriver 86.0.4240.22 (https://chromedriver.storage.googleapis.com/index.html?path=86.0.4240.22/)
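
When a build like this refuses to start, one cheap thing to rule out is a browser/driver version mismatch. The sketch below assumes the commonly cited rule that ChromeDriver should share the browser's major.minor.build prefix; the helper names are hypothetical and not part of Selenium.

```python
def version_prefix(version: str) -> tuple:
    """Return the (major, minor, build) prefix of a dotted version string."""
    parts = version.strip().split(".")
    return tuple(int(p) for p in parts[:3])


def versions_compatible(browser_version: str, driver_version: str) -> bool:
    """True when browser and driver share the same major.minor.build prefix."""
    return version_prefix(browser_version) == version_prefix(driver_version)


# Chromium and Chromedriver versions from the build listed above.
print(versions_compatible("86.0.4240.0", "86.0.4240.22"))
# The same Chromium paired with a driver from a different major release.
print(versions_compatible("86.0.4240.0", "96.0.4664.45"))
```

By this check the pair listed above matches, so a version mismatch is unlikely to be the cause here.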

Here's my code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    chrome_options = webdriver.ChromeOptions()
#   chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--headless')
    chrome_options.add_argument("start-maximized")
    chrome_options.add_argument("disable-infobars")
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--window-size=1024x768')
    chrome_options.add_argument('--user-data-dir=/tmp/user-data')
    chrome_options.add_argument('--profile-directory=/tmp')
    chrome_options.add_argument('--hide-scrollbars')
    chrome_options.add_argument('--enable-logging')
    chrome_options.add_argument('--log-level=0')
    chrome_options.add_argument('--v=99')
#   chrome_options.add_argument('--single-process')
    chrome_options.add_argument('--data-path=/tmp/data-path')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--homedir=/tmp')
    chrome_options.add_argument('--disk-cache-dir=/tmp/cache-dir')
    chrome_options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.3163.100 Safari/537.36')
    chrome_options.add_argument('--remote-debugging-port=9222')
    chrome_options.binary_location = "/opt/bin/chromium"

    driver = webdriver.Chrome(executable_path="/opt/bin/chromedriver",options=chrome_options)
    driver.get('https://www.google.com/')

The things I have tried so far:

Tried various runtimes (Python 3.6, 3.7, 3.8) with no success.
Tried with and without Lambda layers. When using a Lambda layer, my folder structure is relatively simple:

.
├── bin
│   ├── chromedriver (binary)
│   └── chromium (binary)
└── python
    ├── selenium
    ├── selenium-3.14.0.dist-info
    ├── urllib3
    └── urllib3-1.26.7.dist-info

Gone through the majority of the answers here on SO where similar issues have been discussed, for example:

Chrome Driver and Chromium Binaries are not working on aws lambda

WebDriverException: Message: unknown error: Chrome failed to start: crashed error using ChromeDriver Chrome through Selenium Python on Amazon Linux ..etc

Tried almost all combinations of the arguments that I'm passing to Chrome, e.g. with and without --disable-dev-shm-usage, with and without --disable-gpu, etc.

The only thing I noticed is that, depending on which arguments I change, it sometimes throws selenium.common.exceptions.WebDriverException: Message: unknown error: unable to discover open window in chrome instead of the Chrome failed to start: exited abnormally error. As a last resort, I was thinking of compiling my own Chromium 86 build. Has anyone managed to get build 86 or higher running on AWS Lambda?
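
Since ChromeDriver only reports that Chrome exited, one way to narrow this down is to launch the binary directly and read its own stderr, which may name a missing shared library or a permission problem. This is a minimal sketch, not a definitive diagnostic: the helper names are hypothetical, the default path is the one from the question, and `--version` is used only as the simplest possible invocation.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=30):
    """Run a command and return its exit code, stdout and stderr as a dict."""
    try:
        # capture_output/text need Python 3.7 or newer.
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except OSError as exc:
        # Covers a missing file as well as a lost executable bit.
        return {"returncode": None, "stdout": "", "stderr": f"could not start: {exc}"}
    except subprocess.TimeoutExpired:
        return {"returncode": None, "stdout": "", "stderr": "timed out"}
    return {"returncode": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}


def probe_chromium(binary="/opt/bin/chromium"):
    """Ask the Chromium binary for its version and return what it printed."""
    return run_and_capture([binary, "--version"])


# Sanity check of the helper against a binary that is known to work.
print(run_and_capture([sys.executable, "-c", "print('ok')"])["stdout"].strip())
```

Calling `probe_chromium()` from inside the handler and logging the returned dict would show whether the binary can start at all in that runtime, independently of Selenium.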

The Chrome binary is located either in a separate Layer attached to the Lambda function, in which case it's under /opt/bin/chromium, or, if I'm not using any Layers, under the function itself. — vboxer00, Dec 30 '21 at 15:29

score 3 · Answer 1 · answered Jan 02 '22 at 01:00

UPDATE 1/2/2022

I spent most of the last couple of days trying to figure out what could be wrong with my setup. Is it the code? The way I use Lambda/layers? The binaries? The runtime environment? There were too many moving parts, and I didn't want to fall back to Chromium 6x (my last working setup), as that's ancient and lacks certain features I needed, such as parts of the Chrome DevTools Protocol.

Then I stumbled across this repository, which shows how to run Selenium on Lambda from a container image:

https://github.com/umihico/docker-selenium-lambda

Basically, in a couple of minutes I was able to set up my container image linked to Lambda, and it's running:

Python 3.9.8
Chromium 96.0.4664.0
Chromedriver 96.0.4664.45
Selenium 4.1.0

Then I ported over my function code, and with a couple of changes I finally got it working. Here are my working args:

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--disable-dev-tools')
chrome_options.add_argument('--remote-debugging-port=9222')
chrome_options.add_argument('--window-size=1280x1696')
chrome_options.add_argument('--user-data-dir=/tmp/chrome-user-data')
chrome_options.add_argument('--single-process')
chrome_options.add_argument("--no-zygote")
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.binary_location = "/opt/chrome/chrome"

driver = webdriver.Chrome("/opt/chromedriver",options=chrome_options)
driver.get('https://www.google.com/')
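
For reference, the args above can be wrapped into a complete handler roughly as follows. This is a sketch under assumptions rather than the exact code from the repository: `build_chrome_args` is a hypothetical helper, the driver path is passed through Selenium 4's `Service` object instead of positionally (the positional form is deprecated in Selenium 4), and the `try`/`finally` with `driver.quit()` is an addition so that Chrome does not linger if the execution environment is reused.

```python
def build_chrome_args(user_data_dir="/tmp/chrome-user-data"):
    """Return the Chrome flags from the working setup above as a list."""
    return [
        "--no-sandbox",
        "--headless",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        "--disable-dev-tools",
        "--remote-debugging-port=9222",
        "--window-size=1280x1696",
        f"--user-data-dir={user_data_dir}",
        "--single-process",
        "--no-zygote",
        "--ignore-certificate-errors",
    ]


def lambda_handler(event, context):
    # Imported here so the helper above stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args():
        options.add_argument(arg)
    options.binary_location = "/opt/chrome/chrome"

    driver = webdriver.Chrome(service=Service("/opt/chromedriver"), options=options)
    try:
        driver.get("https://www.google.com/")
        return {"title": driver.title}
    finally:
        # Always shut down Chrome and ChromeDriver, even if the page load fails.
        driver.quit()
```

Keeping the flag list in one function makes it easier to toggle a single argument while debugging startup failures.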

The main difference between this setup and a pure Lambda one is that here you deploy the function as a container image (pushed to Amazon ECR), and instead of the headless-chrome or serverless-chrome builds you run Chrome from the Chromium snapshots:

https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html

So Chrome or Chromium needs to be installed for Selenium to work correctly...? — deostroll, Mar 22 '22 at 16:22
Specifically, for this Lambda deployment I'm using a Chrome (Linux) snapshot from: https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Linux_x64/ — vboxer00, Mar 23 '22 at 11:36
vboxer - I am having the exact issue you were having. I need some help to get this to work on EC2 - can you help? — Optional, May 03 '22 at 03:57
You want this to run on EC2? If so, you just need to have Python installed on it and create your webscraper.py code. You could set it to run on a schedule using a cron job. — vboxer00, May 08 '22 at 16:17
