Webscraping with Selenium without browser

Question

I want to use the Python module Selenium to do web scraping from a Jupyter notebook. The notebook runs in a Docker container that has no web browser installed. I want to be able to distribute the notebook so that other users can reproduce the scraping. The notebook runs on a shared JupyterLab container, and it is not possible to update the container to include a browser.

I have tried a number of things:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())

And this:

!pip install chromedriver-binary
from selenium import webdriver
import chromedriver_binary  # Adds chromedriver binary to path

driver = webdriver.Chrome('/opt/conda/lib/python3.7/site-packages/chromedriver_binary')

For this last case, I located the binary using:

import chromedriver_binary
print(chromedriver_binary.__file__)

But unfortunately I have not been able to make any of it work.

Which OS is used in the Docker container? This answer shows how to install the Selenium webdriver for Google Colab running on Ubuntu: https://stackoverflow.com/questions/51046454/how-can-we-use-selenium-webdriver-in-colab-research-google-com/54077842#54077842 – Alexandra Dudkina, Sep 18 '20 at 11:35

score 1 · Answer 1 · answered Sep 15 '20 at 12:01

The Chrome driver depends on a local installation of Chrome, so you'll have to modify the Docker image you're using to install Chrome first.
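To make that dependency concrete, here is a minimal sketch that checks whether any browser binary is visible before trying to start a session. The executable names in `CANDIDATE_BROWSERS` and the helper names `find_browser_binary` and `make_headless_driver` are assumptions for illustration, not part of Selenium; the actual binary name depends on the OS and on how Chrome or Chromium was installed.

```python
import shutil

# Assumed executable names; adjust to whatever the container's OS actually uses.
CANDIDATE_BROWSERS = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)


def find_browser_binary(candidates=CANDIDATE_BROWSERS):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def make_headless_driver(browser_path):
    """Start headless Chrome. Needs selenium and a matching chromedriver."""
    # Imported lazily so the PATH check above works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.binary_location = browser_path
    options.add_argument("--headless")
    # Often needed when the notebook runs as root inside a container.
    options.add_argument("--no-sandbox")
    return webdriver.Chrome(options=options)


browser_path = find_browser_binary()
if browser_path is None:
    print("No Chrome/Chromium binary found on PATH; chromedriver cannot start a session.")
else:
    print(f"Found browser at {browser_path}")
```

If the check prints a path, `make_headless_driver(browser_path)` is the next thing to try. If it finds nothing, installing only the driver (as with `webdriver_manager` or `chromedriver-binary`) will not be enough on its own.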

lscoughlin


You are technically correct, but I'm using a containerized instance of JupyterLab, where I cannot modify the Docker image. So I'm hoping that I can find a workaround and install the browser afterwards. – emil banning Sep 15 '20 at 12:08
A bit late to the party, but I had the same issue, so I created a Jupyter stack with scraper tools and added it to the community stacks: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#community-stacks – GriffoGoes Apr 01 '22 at 01:49
