
I want to build a web page crawler using Python and host it on AWS SageMaker. The reason for using the Selenium package is that my target webpage blocks bots; I have already tried requests, BeautifulSoup, and Scrapy, and none of them work.

I tried downloading the matching webdriver (e.g. the chromedriver binary) and uploading it to one of the directories on the notebook's executable path, located with os.get_exec_path(), but when I run:

from selenium import webdriver
chrome = webdriver.Chrome('/opt/conda/bin/chromedriver')

I get the following error:

WebDriverException: Message: 'chromedriver' executable may have wrong permissions. Please see https://chromedriver.chromium.org/home
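
For reference, this is roughly how I located the executable path and copied the binary into it (the source path of the downloaded file below is illustrative, not my exact upload location):

import os
import shutil

# Directories on the notebook's executable search path
print(os.get_exec_path())  # e.g. ['/opt/conda/bin', '/usr/local/bin', ...]

# Copy the downloaded chromedriver into one of those directories
shutil.copy('/home/ec2-user/SageMaker/chromedriver', '/opt/conda/bin/chromedriver')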

I have tried different workarounds such as headless mode (roughly the variant sketched below), and I have already gone through similar Q&As on Stack Overflow. It seems like it is impossible to do this in the cloud. I'm just posting to get ideas in case anyone has encountered the same situation.
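
The headless variant I tried looked roughly like this (the flags are the usual suggestions for containerized environments; it fails with the same permissions error):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Headless Chrome with the flags commonly suggested for containers
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

chrome = webdriver.Chrome('/opt/conda/bin/chromedriver', options=options)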

armiro
  • Headless doesn't change file permissions, just how you run it. Have you tried the solutions in the 3 duplicates marked? They actually mention permissions issues, not headless. – h4z3 Apr 05 '22 at 08:38
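
A minimal sketch of the permissions fix the comment is pointing at, assuming the binary sits at the path from the question (run in a notebook cell):

import os
import stat

# Give the uploaded chromedriver binary the execute bit
path = '/opt/conda/bin/chromedriver'
os.chmod(path, os.stat(path).st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)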

0 Answers