I want to build a web page crawler using Python and host it on AWS SageMaker. The reason for using the Selenium package is that my target webpage blocks access by bots; I have tried requests, beautifulsoup, and scrapy, and none of them work.
I downloaded the matching webdriver (e.g. the chromedriver binary) and uploaded it to one of the notebook's executable paths, which I located with os.get_exec_path(). Still, when I run:
from selenium import webdriver
chrome = webdriver.Chrome('/opt/conda/bin/chromedriver')
I get the following error:
WebDriverException: Message: 'chromedriver' executable may have wrong permissions. Please see https://chromedriver.chromium.org/home
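For context, this is roughly what my setup looks like. It is only a minimal sketch: the source path below is just an example of where the uploaded file might land, and the os.chmod call is the kind of permission fix the error message seems to point at, not something that has solved it for me:

import os
import shutil
import stat

# Directories this notebook kernel searches for executables
print(os.get_exec_path())   # e.g. ['/opt/conda/bin', ...]

# Copy the uploaded chromedriver into one of those directories
driver_src = "/home/ec2-user/SageMaker/chromedriver"   # hypothetical upload location
driver_dst = "/opt/conda/bin/chromedriver"
shutil.copy(driver_src, driver_dst)

# Add the execute bits, since the error complains about "wrong permissions"
os.chmod(driver_dst, os.stat(driver_dst).st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)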
I have tried other workarounds such as headless mode, and I have gone through similar Q&As on Stack Overflow. It seems like it may simply not be possible to do this on the cloud. I'm just posting to get ideas from anyone who has run into the same situation.
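For completeness, this is roughly what the headless attempt looks like (a minimal sketch; the flags are the standard Chrome options commonly used in container environments, and the driver path is the same one as above, which in my case still fails with the same permissions error):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Flags commonly needed to run Chrome without a display in a notebook/container
options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")

# Same driver path as before
chrome = webdriver.Chrome("/opt/conda/bin/chromedriver", options=options)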