2

I have to automate a task that involves a lot of Google searching, which I am doing with Selenium and Python. After about 20 searches, Google reports that suspicious activity was detected and shows a reCAPTCHA to prove I am not a robot.
I have tried other approaches (such as changing the browser profile), but the problem remains.

How can I get rid of it?

undetected Selenium
  • 6
    This CAPTCHA is placed there to prevent exactly what you are doing. A bot automating Google searches IS suspicious activity, because it could be used for page positioning. If you need to perform Google searches, use their API: https://developers.google.com/custom-search/v1/overview – stasiaks Mar 01 '19 at 08:53
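
    The API route suggested in this comment could look roughly like the sketch below. It assumes you already have an API key and a Programmable Search Engine ID; the function names `build_search_url` and `search` are made up for illustration, and only the endpoint and the `key`, `cx`, `q`, and `start` parameters come from the Custom Search JSON API itself.

```python
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, start=1):
    """Return the request URL for one page of Custom Search results."""
    params = {
        "key": api_key,    # API key (assumed to be created beforehand)
        "cx": engine_id,   # Programmable Search Engine ID (assumed)
        "q": query,        # the search terms
        "start": start,    # 1-based index of the first result to return
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def search(query, api_key, engine_id):
    """Fetch one page of results and return the result links."""
    url = build_search_url(query, api_key, engine_id)
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    # The "items" key is absent when the query has no results.
    return [item["link"] for item in payload.get("items", [])]
```

    Calling `search("selenium python", api_key, engine_id)` would then return a list of result URLs without driving a browser at all. The API has its own usage quota, so check the current limits before relying on it for large volumes.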

2 Answers

1

I solved this by rotating a decent-sized pool of proxies behind an internal load balancer, switching user agents, and using CAPTCHA-solving APIs where appropriate. Having a good number of clean IP addresses and using them wisely has had the biggest impact so far.
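
As a rough illustration of the rotation idea, here is a minimal sketch that hands out a proxy and a user agent per browser session in round-robin order. It is not my actual setup: the class name `RotatingProfile` is made up, the addresses and agent strings are placeholders, and plain round-robin stands in for the load balancer.

```python
import itertools


class RotatingProfile:
    """Round-robin over proxies and user agents, one pair per session."""

    def __init__(self, proxies, user_agents):
        if not proxies or not user_agents:
            raise ValueError("need at least one proxy and one user agent")
        self._proxies = itertools.cycle(proxies)
        self._user_agents = itertools.cycle(user_agents)

    def next_chrome_args(self):
        """Return Chrome command-line flags for the next session."""
        return [
            f"--proxy-server={next(self._proxies)}",
            f"--user-agent={next(self._user_agents)}",
        ]


# Placeholder values (assumptions), not real endpoints or agent strings.
profile = RotatingProfile(
    proxies=["http://10.0.0.1:8080", "http://10.0.0.2:8080"],
    user_agents=["ExampleAgent/1.0", "ExampleAgent/2.0", "ExampleAgent/3.0"],
)
first = profile.next_chrome_args()
second = profile.next_chrome_args()
third = profile.next_chrome_args()
```

Each returned flag can be passed to Selenium's `ChromeOptions.add_argument` before a new driver is started. Using pools of different lengths means the proxy and user agent pairings shift over time instead of repeating in lockstep.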

Zoe
exfriend
0

Websites can inspect your network traffic and identify it as coming from a bot pretty easily. Google has released five reCAPTCHA types to choose from when creating a new site. Four of them are active, while reCAPTCHA v1 has been shut down.

reCAPTCHA versions and types

  • reCAPTCHA v3 (verify requests with a score): reCAPTCHA v3 allows you to verify if an interaction is legitimate without any user interaction. It is a pure JavaScript API returning a score, giving you the ability to take action in the context of your site: for instance requiring additional factors of authentication, sending a post to moderation, or throttling bots that may be scraping content.
  • reCAPTCHA v2 - "I'm not a robot" Checkbox: The "I'm not a robot" Checkbox requires the user to click a checkbox indicating the user is not a robot. This will either pass the user immediately (with No CAPTCHA) or challenge them to validate whether or not they are human. This is the simplest option to integrate with and only requires two lines of HTML to render the checkbox.

  • reCAPTCHA v2 - Invisible reCAPTCHA badge: The invisible reCAPTCHA badge does not require the user to click on a checkbox, instead it is invoked directly when the user clicks on an existing button on your site or can be invoked via a JavaScript API call. The integration requires a JavaScript callback when reCAPTCHA verification is complete. By default only the most suspicious traffic will be prompted to solve a captcha. To alter this behavior edit your site security preference under advanced settings.

  • reCAPTCHA v2 - Android: The reCAPTCHA Android library is part of the Google Play services SafetyNet APIs. This library provides native Android APIs that you can integrate directly into an app. You should set up Google Play services in your app and connect to the GoogleApiClient before invoking the reCAPTCHA API. This will either pass the user through immediately (without a CAPTCHA prompt) or challenge them to validate whether they are human.
  • reCAPTCHA v1: reCAPTCHA v1 has been shut down since March 2018.
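
To make the reCAPTCHA v3 score idea from the list above concrete, here is a minimal sketch of what a site owner might do with the score on the server side. It assumes the token has already been sent to Google's `siteverify` endpoint and the JSON reply parsed into a dict with `success` and `score` fields. The function name, the thresholds, and the action labels are illustrative assumptions, not part of the reCAPTCHA API.

```python
def action_for_score(verify_response, pass_threshold=0.5, review_threshold=0.3):
    """Map a parsed siteverify reply to one of four hypothetical actions."""
    # A failed verification (bad or expired token) is rejected outright.
    if not verify_response.get("success", False):
        return "reject"
    # Scores range from 0.0 (likely a bot) to 1.0 (likely a human).
    score = verify_response.get("score", 0.0)
    if score >= pass_threshold:
        return "allow"
    if score >= review_threshold:
        return "moderate"  # e.g. send a post to moderation
    return "require_extra_auth"  # e.g. ask for an additional factor
```

For example, `action_for_score({"success": True, "score": 0.9})` returns `"allow"`, while a low score falls through to `"require_extra_auth"`. This is why automated traffic can be throttled without the user ever seeing a challenge.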

Solution

However, there are some generic approaches to avoid getting detected while web scraping:

Outro

See:

undetected Selenium