1

I am familiar with web scraping and have collected data from several websites using Python with Selenium WebDriver and Chrome. With one website, however, the front page opens normally, but as soon as I click a link or navigate to any other page, the site blocks me. It seems to detect that I am using an automated Chrome instance.

undetected Selenium
Imran Rafiq
  • Probably the number of requests per second. Be aware that scraping is illegal most of the time and they can easily ban your IP; you should add a sleep each time you scrape – BlueSheepToken Mar 13 '19 at 09:47
  • I have never used Selenium, but if it is possible, try adding a User-Agent to the request. – Siddharth Dushantha Mar 13 '19 at 09:49
  • I am using sleep, but it has no effect on the website. I think the website detects the automated Google Chrome request the first time and then does not allow me to do anything – Imran Rafiq Mar 13 '19 at 11:51

2 Answers

2

This may be because the website uses reCAPTCHA v3, which "allows you to verify if an interaction is legitimate without any user interaction". This means that they can identify if you are not a human without asking you to check the famous "I'm not a robot" box. That box is used in the former version of reCAPTCHA, v2.

Read more about reCAPTCHA here: https://developers.google.com/recaptcha/docs/versions

I don't think it's possible to work around this with Selenium. And, as was already mentioned, web scraping is often illegal.

Carlos
  • The main problem is how to drive automated Chrome so that the website does not detect that it is automated – Imran Rafiq Mar 13 '19 at 11:52
0

These days, websites can detect your program as a bot fairly easily. Google currently offers four reCAPTCHA types to choose from when registering a new site:

  • reCAPTCHA v3
  • reCAPTCHA v2 ("I'm not a robot" Checkbox)
  • reCAPTCHA v2 (Invisible reCAPTCHA badge)
  • reCAPTCHA v2 (Android)

Solution

However, there are some generic approaches to avoid getting detected while web scraping:
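As an illustration only, two ideas raised in the comments on the question (pausing between requests and sending a custom User-Agent) could be sketched roughly as below. The helper names `chrome_arguments`, `polite_delay` and `make_driver`, the delay bounds, and the example User-Agent string are all assumptions made for this sketch and are not taken from the thread. `--user-agent` is a Chrome command-line switch, passed here through Selenium's `ChromeOptions.add_argument`.

```python
import random
import time

# Hypothetical example value; substitute the User-Agent of a browser you actually use.
EXAMPLE_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def chrome_arguments(user_agent):
    """Return the Chrome command-line switches that set a custom User-Agent."""
    return [f"--user-agent={user_agent}"]


def polite_delay(min_seconds=2.0, max_seconds=5.0, sleep=time.sleep):
    """Pause for a random interval and return the number of seconds waited.

    The bounds are arbitrary assumptions; `sleep` is injectable so the
    function can be exercised without actually waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


def make_driver(user_agent=EXAMPLE_USER_AGENT):
    """Start Chrome through Selenium with the custom User-Agent applied."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(user_agent):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

A script would call `make_driver()` once and then `polite_delay()` before each `driver.get(...)` or click. As the asker notes in the comments, sleeping alone made no difference on the site in question, so this is unlikely to help against something like reCAPTCHA v3; it mainly lowers the request rate.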

Outro

See:

undetected Selenium