I'd like to run a Selenium script on AWS Lambda via a Docker container.
I'm using an AWS EC2 instance to build the container and test it locally with the AWS Lambda Runtime Interface Emulator (RIE). Once it passes the local test, the image is pushed to ECR so that AWS Lambda can use it.
Although the local RIE test on EC2 always succeeds, I can't get the function to work on Lambda itself. Testing on Lambda currently always fails with the following error message:
{
"errorMessage": "Message: session not created\nfrom tab crashed\n (Session info: headless chrome=93.0.4577.63)\n",
"errorType": "SessionNotCreatedException",
"stackTrace": [
" File \"/var/task/app.py\", line 32, in handler\n driver = webdriver.Chrome(\n",
" File \"/var/task/selenium/webdriver/chrome/webdriver.py\", line 76, in __init__\n RemoteWebDriver.__init__(\n",
" File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 157, in __init__\n self.start_session(capabilities, browser_profile)\n",
" File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 252, in start_session\n response = self.execute(Command.NEW_SESSION, parameters)\n",
" File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 321, in execute\n self.error_handler.check_response(response)\n",
" File \"/var/task/selenium/webdriver/remote/errorhandler.py\", line 242, in check_response\n raise exception_class(message, screen, stacktrace)\n"
]
}
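To make failed runs easier to compare, here is a minimal sketch that pulls the key fields out of an error payload like the one above. The function name summarize_lambda_error and the returned keys are my own, not part of Lambda or Selenium, and the sample is a trimmed copy of the payload with the stack trace left out.

```python
import json
import re


def summarize_lambda_error(payload: str) -> dict:
    """Extract the error type, the first two message lines, and the Chrome version."""
    error = json.loads(payload)
    message = error.get("errorMessage", "")
    # The message is multi-line; drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in message.splitlines() if line.strip()]
    # Selenium reports the browser as "chrome=<version>" in the session info.
    match = re.search(r"chrome=([\d.]+)", message)
    return {
        "error_type": error.get("errorType"),
        "headline": lines[0] if lines else "",
        "detail": lines[1] if len(lines) > 1 else "",
        "chrome_version": match.group(1) if match else None,
    }


# Trimmed copy of the payload shown above (stack trace omitted).
sample = json.dumps({
    "errorMessage": "Message: session not created\nfrom tab crashed\n"
                    "  (Session info: headless chrome=93.0.4577.63)\n",
    "errorType": "SessionNotCreatedException",
})
summary = summarize_lambda_error(sample)
print(summary)
```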
Here is all the code I'm currently using:
Dockerfile
FROM public.ecr.aws/lambda/python:3.8
#Download and install Chrome
RUN curl https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm > ./google-chrome-stable_current_x86_64.rpm
RUN yum install -y ./google-chrome-stable_current_x86_64.rpm
RUN rm ./google-chrome-stable_current_x86_64.rpm
#Download and install chromedriver
RUN yum install -y unzip
RUN curl http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip > /tmp/chromedriver.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
RUN rm /tmp/chromedriver.zip
RUN yum remove -y unzip
#Upgrade pip and install Python dependencies
RUN pip3 install --upgrade pip
RUN pip3 install selenium --target "${LAMBDA_TASK_ROOT}"
#Copy app.py
COPY app.py ${LAMBDA_TASK_ROOT}
CMD ["app.handler"]
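Since the Dockerfile downloads the current stable Chrome and the LATEST_RELEASE chromedriver independently, the two are not guaranteed to share a major version. Below is a minimal sketch of a check for that, assuming the usual "Google Chrome X.Y.Z.W" and "ChromeDriver X.Y.Z.W (...)" output of the two --version commands. The helper names and the sample strings are illustrative only, and I'm not claiming a mismatch is the cause of the error above.

```python
import re


def major_version(version_output: str) -> int:
    """Return the major version from a '--version' output line."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version found in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True when Chrome and chromedriver report the same major version."""
    return major_version(chrome_output) == major_version(driver_output)


# Illustrative strings; inside the container they would come from
# `google-chrome-stable --version` and `chromedriver --version`.
chrome_line = "Google Chrome 93.0.4577.63"
driver_line = "ChromeDriver 93.0.4577.63 (build details omitted)"
print(versions_match(chrome_line, driver_line))
```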
app.py
import json
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
def handler(event, context):
    chrome_options = Options()
    chrome_options.add_argument("--allow-running-insecure-content")
    chrome_options.add_argument("--ignore-certificate-errors")
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--disable-dev-tools")
    chrome_options.add_argument("--no-zygote")
    chrome_options.add_argument("--v=99")
    chrome_options.add_argument("--single-process")
    chrome_options.binary_location = '/usr/bin/google-chrome-stable'
    capabilities = webdriver.DesiredCapabilities().CHROME
    capabilities['acceptSslCerts'] = True
    capabilities['acceptInsecureCerts'] = True
    driver = webdriver.Chrome(
        executable_path='/usr/local/bin/chromedriver',
        options=chrome_options,
        desired_capabilities=capabilities)
    if driver:
        response = {
            "statusCode": 200,
            "body": json.dumps("Selenium Driver Initiated")
        }
        return response
Local Container Testing with RIE
$ docker run -p 9000:8080 aws-scraper
results in > time="2021-09-03T15:24:13.269" level=info msg="exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)"
$ curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'
results in > {"statusCode": 200, "body": "\"Selenium Driver Initiated\""}
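For repeated local runs, the curl call above can also be scripted. This is a minimal sketch using only the standard library, assuming the container is started with the same 9000:8080 port mapping. The names build_invoke_request and invoke are my own, and the 60 second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Invocation path exposed by the Runtime Interface Emulator (same as in the curl call).
RIE_PATH = "/2015-03-31/functions/function/invocations"


def build_invoke_request(event, host="localhost", port=9000):
    """Build the POST request that the curl command sends."""
    body = json.dumps(event).encode("utf-8")
    url = f"http://{host}:{port}{RIE_PATH}"
    return urllib.request.Request(url, data=body, method="POST")


def invoke(event, timeout=60.0):
    """Send the event to the local container and return the decoded JSON response."""
    with urllib.request.urlopen(build_invoke_request(event), timeout=timeout) as resp:
        return json.loads(resp.read())


# Building the request needs no running container; invoke({}) does.
request = build_invoke_request({})
print(request.get_method(), request.full_url)
```

With the container running, invoke({}) should return the same dictionary that curl prints.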
I really can't figure it out. I also tried following the question "Selenium works on AWS EC2 but not on AWS Lambda", but to no avail.
Any help would be more than welcome. Thank you in advance.