My Dockerfile for the Lambda Container

  • When I run these steps on an EC2 server, the script runs fine.
  • I know that Lambda has a read-only file system, so the environment differs in that regard.

# This definitely works
FROM public.ecr.aws/lambda/python:3.9

# Copy function code
COPY . ${LAMBDA_TASK_ROOT}

# Install the function's dependencies using file requirements.txt
# from your project folder.
COPY requirements.txt  .
RUN  pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Installing Firefox and Gecko Driver (problem should be here)
RUN yum -y install amazon-linux-extras
RUN yum -y install Xvfb
RUN PYTHON=python2 amazon-linux-extras install firefox -y
RUN yum -y install wget
RUN wget https://github.com/mozilla/geckodriver/releases/download/v0.32.2/geckodriver-v0.32.2-linux64.tar.gz
RUN yum -y install tar 
RUN tar -xf geckodriver-v0.32.2-linux64.tar.gz
RUN mv geckodriver /usr/local/bin/
# ENV persists into the running container; a RUN export only lasts for that one build step
ENV MOZ_HEADLESS=1
ENV HOME=/tmp/profile

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "{handler}" ]

My script (setting up the selenium firefox driver)

  • Using use_portal_driver results in the error.

import os
import contextlib
import shutil
from selenium.webdriver.firefox.options import Options
from selenium import webdriver

@contextlib.contextmanager
def use_portal_driver():
    # /tmp is the only writable location in Lambda, so the profile lives there
    os.makedirs("/tmp/profile", exist_ok=True)

    options = Options()
    options.set_preference("pdfjs.disabled", True)
    options.set_preference("browser.download.folderList", 2)
    options.set_preference("browser.download.manager.useWindow", False)
    os.makedirs("/tmp/portal_downloads", exist_ok=True)
    options.set_preference("browser.download.dir", os.path.abspath("/tmp/portal_downloads"))
    options.set_preference("browser.helperApps.neverAsk.saveToDisk",
                           "application/pdf, application/force-download")
    options.add_argument("--headless")
    options.add_argument('--disable-gpu')
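    # Flag and path must be separate arguments; one string with a space is passed to Firefox as a single argv entry
    options.add_argument("--profile")
    options.add_argument("/tmp/profile")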

    driver = webdriver.Firefox(options=options, log_path='/tmp/firefox.log', service_log_path="/tmp/firefox_service.log")  # error appears here
    try:
        driver.implicitly_wait(20)
        yield driver
    finally:
        # Always shut the browser down, even if the with-body raises
        driver.quit()
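
One thing that can be ruled out from inside the function itself is whether the writable directories and the environment variables are really in place when Firefox starts. The following is only a sketch: `prepare_firefox_env` and its `base_dir` parameter are hypothetical names, the directory names mirror the ones used above, and the `TMPDIR` line assumes that Firefox honours that variable for its scratch files on Linux.

```python
import os


def prepare_firefox_env(base_dir="/tmp"):
    """Create the writable directories and set the env vars before the browser starts.

    Hypothetical helper; directory names mirror the ones used in the script.
    """
    profile_dir = os.path.join(base_dir, "profile")
    download_dir = os.path.join(base_dir, "portal_downloads")
    for path in (profile_dir, download_dir):
        os.makedirs(path, exist_ok=True)

    # Child processes (geckodriver, Firefox) inherit os.environ.
    os.environ["HOME"] = profile_dir
    os.environ["MOZ_HEADLESS"] = "1"
    # Assumption: Firefox uses TMPDIR for temporary files on Linux.
    os.environ["TMPDIR"] = base_dir
    return profile_dir, download_dir
```

Calling `prepare_firefox_env()` at the top of `use_portal_driver`, before `webdriver.Firefox(...)`, would be the natural place for it.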

The error I get

Traceback (most recent call last):
  File "/var/task/microservices/rechnungspruefung/lambda_functions/rechnungen_fuer_apotheken_filiale_automatisiert_hochladen.py", line 159, in handler
    download = portal.aktuelle_monatsrechnung_herunterladen()
  File "/var/task/microservices/rechnungspruefung/automatisiertes_hochladen/noweda.py", line 21, in aktuelle_monatsrechnung_herunterladen
    with use_portal_driver() as driver:
  File "/var/lang/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/var/task/microservices/rechnungspruefung/automatisiertes_hochladen/portal_driver.py", line 50, in use_portal_driver
    driver = webdriver.Firefox(options=options, log_path='/tmp/firefox.log', service_log_path="/tmp/firefox_service.log")
  File "/var/task/selenium/webdriver/firefox/webdriver.py", line 197, in __init__
    super().__init__(command_executor=executor, options=options, keep_alive=True)
  File "/var/task/selenium/webdriver/remote/webdriver.py", line 288, in __init__
    self.start_session(capabilities, browser_profile)
  File "/var/task/selenium/webdriver/remote/webdriver.py", line 381, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/var/task/selenium/webdriver/remote/webdriver.py", line 444, in execute
    self.error_handler.check_response(response)
  File "/var/task/selenium/webdriver/remote/errorhandler.py", line 249, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Failed to read marionette port
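
The exception only says that the driver never heard back from Firefox. The log files the script already writes to `/tmp` may say more about why. A minimal sketch for getting their last lines into the function output (which Lambda forwards to CloudWatch Logs); `tail_log` and `max_lines` are hypothetical names, not part of Selenium:

```python
from collections import deque


def tail_log(path, max_lines=40):
    """Return the last max_lines lines of a log file, or a short note if it is missing."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            # deque with maxlen keeps only the most recent lines while reading
            return "".join(deque(handle, maxlen=max_lines))
    except FileNotFoundError:
        return f"<{path} not found>"
```

It could be called from an `except` block around `webdriver.Firefox(...)`, for example `print(tail_log("/tmp/firefox.log"))`, before re-raising the exception.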

Do I need to do something else in my Dockerfile or in my script (setting up the selenium firefox driver)?

  • I set the Firefox profile to a subdirectory of the tmp directory, because I know that everything else in the Lambda environment is read-only.
  • I expect the script to run fine and not throw this error.
  • If everything is read only, you'll have a problem because the browser won't be able to create the file that tells the webdriver what port/sessionID to use. – pcalkins Mar 08 '23 at 19:57
  • Can I place that file in the tmp directory? – Julius Krahn Mar 09 '23 at 05:09
  • The browser will choose where to place that file... it will also create a temporary folder for the Selenium session and, by default, for the temporary profile. They are cleaned up when the driver quits. Since Selenium sends the command to launch the browser in dev-mode you'd have to modify Selenium and/or webdriver to launch it with different args. (not sure what those would be... or if there are command line args for that) – pcalkins Mar 09 '23 at 17:51
  • Check this thread, there are a number of answers here: https://stackoverflow.com/questions/72374955/failed-to-read-marionette-port-when-running-selenium-geckodriver-firefox-a – pcalkins Mar 09 '23 at 18:05
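
To follow up on the read-only point from the comments, a small probe can show which directories the running function is actually able to write to. This is a sketch only; `is_writable` and `writable_report` are hypothetical helper names, and the list of directories to probe is up to the caller.

```python
import os
import tempfile


def is_writable(directory):
    """Return True if a file can actually be created inside directory."""
    try:
        # The temporary file is removed again when the with-block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers missing directories, permission errors and read-only file systems.
        return False


def writable_report(directories):
    """Map each directory to whether a file could be created in it."""
    return {directory: is_writable(directory) for directory in directories}
```

Printing `writable_report(["/tmp", "/var/task", os.path.expanduser("~")])` from the handler would show whether the home directory the browser sees is one it can write to.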
