I am trying to scrape https://marketchameleon.com/Calendar/Earnings, but the website is blocking Selenium. The page itself loads initially, but the earnings calendar script does not. If I refresh the page, I get an "Access Denied You don't have permission" error. I have tried changing the User-Agent and other Chrome options, but the same issue persists. Can anyone let me know how to bypass the detection?

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from fake_useragent import UserAgent

options = Options()

# Use a random User-Agent string for this session
ua = UserAgent()
userAgent = ua.random
options.add_argument(f'user-agent={userAgent}')
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")

# Drop the "enable-automation" switch and the automation extension
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)

# Pass the options via `options=`; the `chrome_options=` keyword is deprecated
driver = webdriver.Chrome(options=options)
driver.get('https://marketchameleon.com/Calendar/Earnings')
  • Do you need to perform some action, like clicking inside the site? If not, you can just use BeautifulSoup for scraping. It doesn't depend on a chromedriver and probably wouldn't be blocked. – francovici Jul 25 '20 at 18:53
  • I need to switch from Show 25 Entries to Show All Entries to grab all the rows. I tried using Requests and BeautifulSoup earlier but it would just hang on requests.get(page), although if there are other options to grab the entire table, I would be open to it. – SimplySaid Jul 25 '20 at 19:15
  • It's a good place to start. It'd be much simpler without Selenium. – francovici Jul 25 '20 at 19:32
  • Does this answer your question? [Can a website detect when you are using selenium with chromedriver?](https://stackoverflow.com/questions/33225947/can-a-website-detect-when-you-are-using-selenium-with-chromedriver) – Dan-Dev Jul 26 '20 at 03:12
  • Have you read all 19 answers to https://stackoverflow.com/questions/33225947/can-a-website-detect-when-you-are-using-selenium-with-chromedriver – Dan-Dev Jul 26 '20 at 03:13
  • Requests + BS4 might not work because I have to wait for the script tag on the "widget" to load. I tried replacing CDC_ with random characters using a hex editor, but it didn't work. I also tried adding all the suggested options in the thread, but I'm still running into the same issue. I will try injecting JS a little later. Thanks. – SimplySaid Jul 26 '20 at 17:09
  • Does the discussion [How to scrape the Javascript based site https://marketchameleon.com/Calendar/Earnings using Selenium and Python?](https://stackoverflow.com/questions/62353469/how-to-scrape-the-javascript-based-site-https-marketchameleon-com-calendar-ear/62364975#62364975) help you? – undetected Selenium Jul 26 '20 at 21:58
  • That does help, but I'm not too sure how to bypass it. I might just look for a manual workaround (downloading a CSV or using a macro), since the solutions I have tried thus far haven't worked and I only need the data on a daily basis. Thanks! – SimplySaid Jul 29 '20 at 01:06
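
The JavaScript-injection idea mentioned in the comments could be sketched roughly as follows. This is a minimal sketch, not a confirmed fix for this site: it assumes a Selenium version whose Chrome driver exposes `execute_cdp_cmd`, and it uses the DevTools command `Page.addScriptToEvaluateOnNewDocument` to run a script before each page's own scripts. The names `HIDE_WEBDRIVER_JS`, `stealth_arguments` and `build_driver` are hypothetical helpers introduced only for illustration. The server may rely on other signals (IP reputation, TLS fingerprint, request timing), so the "Access Denied" response can persist even with these settings.

```python
# Sketch only: hide the most obvious automation markers before a page loads.
# All helper names below are hypothetical, not part of Selenium.

# Script evaluated in every new document so that `navigator.webdriver`
# reads as undefined instead of true.
HIDE_WEBDRIVER_JS = (
    "Object.defineProperty(navigator, 'webdriver', "
    "{get: () => undefined});"
)


def stealth_arguments(user_agent=None):
    """Return Chrome command-line arguments for a less conspicuous session."""
    args = [
        # Stops Blink from setting navigator.webdriver in the first place.
        "--disable-blink-features=AutomationControlled",
        "--start-maximized",
        "--disable-extensions",
    ]
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    return args


def build_driver(user_agent=None):
    """Create a Chrome driver with the arguments and script above applied."""
    # Imported lazily so the helpers above can be inspected without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in stealth_arguments(user_agent):
        options.add_argument(arg)
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)

    driver = webdriver.Chrome(options=options)
    # Register the script through the Chrome DevTools Protocol; it runs
    # in each new document before the site's own JavaScript.
    driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": HIDE_WEBDRIVER_JS},
    )
    return driver
```

With this in place, `driver = build_driver()` followed by `driver.get('https://marketchameleon.com/Calendar/Earnings')` would replace the last two lines of the snippet in the question. Keeping the argument list in its own function makes it easy to check which flags are being sent without starting a browser.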

0 Answers