The following three files live in the same directory. We save them and run make
from that directory on the command line to build the Docker image and run a container:
requirements.txt
PyDrive==1.3.1
selenium==4.6.0
webdriver-manager==3.8.5
google-cloud-bigquery==3.4.0
Dockerfile
# Jupyter Image
FROM jupyter/base-notebook
# Set this variable so the container launches JupyterLab instead of the default Jupyter Notebook, so we don't have to pass it with the -e option when running the container
ENV JUPYTER_ENABLE_LAB=yes
# Set User
USER root
# Install chrome
RUN apt-get update && apt-get install -y gnupg2
RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install
RUN rm google-chrome-stable_current_amd64.deb
# Back to Jovyan
USER jovyan
# Upgrade Pip, Install Python Packages
RUN pip install --upgrade pip
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
Makefile
run:
	docker build --platform linux/amd64 -t myname/docker-jupyter-cbbselenium-notebook .
	docker run --rm --name my-jupyter-cbb-selenium \
		-p 8889:8889 \
		-p 8888:8888 \
		myname/docker-jupyter-cbbselenium-notebook
I am on an M1 chip locally, so I pass --platform linux/amd64
to the build, and I publish both ports 8888 and 8889 because of a port issue I was having.
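Because the image is built for linux/amd64 on an arm64 host, it may help to confirm which architecture the notebook kernel actually reports. The following is only a minimal sketch using the standard library; the helper name describe_platform is hypothetical and not part of the setup above.

```python
import platform


def describe_platform() -> dict:
    """Collect basic facts about the machine the notebook kernel runs on."""
    return {
        "machine": platform.machine(),  # CPU architecture as seen by the kernel
        "system": platform.system(),    # operating system name
        "release": platform.release(),  # OS kernel release string
    }


info = describe_platform()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this in a notebook cell inside the container shows what the Python process sees, which would be expected to differ from what the same code reports on the M1 host itself.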
The image builds and the container runs without problems. We can copy the http://127.0.0.1:8888/lab?token=028...
URL into a local Chrome browser to open JupyterLab, then create a notebook and run the following Python:
# Libraries
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
# Headless chrome options
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--window-size=1200,600')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# Download chromedriver, then start the driver
s = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s, options=options)
The last line, webdriver.Chrome(service=s, options=options),
throws the following error, which I have been unable to debug.
Error message:
---------------------------------------------------------------------------
WebDriverException Traceback (most recent call last)
Cell In [7], line 1
----> 1 driver = webdriver.Chrome(service=s, options=options)
File /opt/conda/lib/python3.10/site-packages/selenium/webdriver/chrome/webdriver.py:81, in WebDriver.__init__(self, executable_path, port, options, service_args, desired_capabilities, service_log_path, chrome_options, service, keep_alive)
78 if not service:
79 service = Service(executable_path, port, service_args, service_log_path)
---> 81 super().__init__(
82 DesiredCapabilities.CHROME["browserName"],
83 "goog",
84 port,
85 options,
86 service_args,
87 desired_capabilities,
88 service_log_path,
89 service,
90 keep_alive,
91 )
File /opt/conda/lib/python3.10/site-packages/selenium/webdriver/chromium/webdriver.py:106, in ChromiumDriver.__init__(self, browser_name, vendor_prefix, port, options, service_args, desired_capabilities, service_log_path, service, keep_alive)
103 self.service.start()
105 try:
--> 106 super().__init__(
107 command_executor=ChromiumRemoteConnection(
108 remote_server_addr=self.service.service_url,
109 browser_name=browser_name,
110 vendor_prefix=vendor_prefix,
111 keep_alive=keep_alive,
112 ignore_proxy=_ignore_proxy,
113 ),
114 options=options,
115 )
116 except Exception:
117 self.quit()
File /opt/conda/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py:288, in WebDriver.__init__(self, command_executor, desired_capabilities, browser_profile, proxy, keep_alive, file_detector, options)
286 self._authenticator_id = None
287 self.start_client()
--> 288 self.start_session(capabilities, browser_profile)
File /opt/conda/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py:381, in WebDriver.start_session(self, capabilities, browser_profile)
379 w3c_caps = _make_w3c_caps(capabilities)
380 parameters = {"capabilities": w3c_caps}
--> 381 response = self.execute(Command.NEW_SESSION, parameters)
382 if "sessionId" not in response:
383 response = response["value"]
File /opt/conda/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py:444, in WebDriver.execute(self, driver_command, params)
442 response = self.command_executor.execute(driver_command, params)
443 if response:
--> 444 self.error_handler.check_response(response)
445 response["value"] = self._unwrap_value(response.get("value", None))
446 return response
File /opt/conda/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py:249, in ErrorHandler.check_response(self, response)
247 alert_text = value["alert"].get("text")
248 raise exception_class(message, screen, stacktrace, alert_text) # type: ignore[call-arg] # mypy is not smart enough here
--> 249 raise exception_class(message, screen, stacktrace)
WebDriverException: Message: unknown error: Chrome failed to start: crashed.
(chrome not reachable)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
#0 0x004000669563 <unknown>
#1 0x004000428667 <unknown>
#2 0x004000450bb7 <unknown>
#3 0x00400044cd90 <unknown>
#4 0x00400048d6b7 <unknown>
#5 0x00400048d05f <unknown>
#6 0x004000484ef3 <unknown>
#7 0x00400045849e <unknown>
#8 0x0040004595ae <unknown>
#9 0x0040006b8fde <unknown>
#10 0x0040006bc4c4 <unknown>
#11 0x00400069f78e <unknown>
#12 0x0040006bd393 <unknown>
#13 0x004000692665 <unknown>
#14 0x0040006de108 <unknown>
#15 0x0040006de296 <unknown>
#16 0x0040006f9183 <unknown>
#17 0x004002ad3b43 <unknown>
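The stack trace above comes from chromedriver and does not say why Chrome itself exited. One way to get more detail might be to launch the Chrome binary directly from a notebook cell and look at its exit code and stderr. This is a sketch under assumptions: the binary path is the one reported in the error message, the flags are standard Chrome switches, and the helper name probe is hypothetical.

```python
import subprocess

# Path taken from the error message above.
CHROME_BINARY = "/usr/bin/google-chrome"
# Same headless-style flags as the notebook code, plus --dump-dom so Chrome exits on its own.
PROBE_FLAGS = [
    "--headless",
    "--no-sandbox",
    "--disable-gpu",
    "--disable-dev-shm-usage",
    "--dump-dom",
    "about:blank",
]


def probe(command, timeout=60):
    """Run a command and return (exit_code, stdout, stderr).

    exit_code is None if the command could not be started or timed out.
    On Linux a negative exit code means the process was killed by a signal.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except OSError as exc:  # e.g. binary missing or not executable
        return None, "", str(exc)
    except subprocess.TimeoutExpired:
        return None, "", f"timed out after {timeout}s"
    return result.returncode, result.stdout, result.stderr


exit_code, stdout, stderr = probe([CHROME_BINARY] + PROBE_FLAGS)
print("exit code:", exit_code)
print("stderr (tail):", stderr[-2000:])
```

If Chrome also fails when started this way, the problem is likely in the browser or the container environment rather than in Selenium or webdriver-manager.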
Has anyone else run into this error? Are there any changes to our Dockerfile, Makefile, or Python code that would get Selenium running in the Jupyter notebook and stop Chrome and chromedriver from crashing?
As an aside, any recommendations on how to edit or shorten this question? I wanted to provide a reproducible example, but this is a lot to paste into a Stack Overflow post.