This is quite late to answer, but it may still help somebody in the future.
I had the same question and started analysing it. I went through several websites, blogs and research papers to find out how it works internally.
Below is what I understood about the captcha implementation.
- The `data-sitekey` is associated with the website. Before loading the captcha, Google verifies that the key is being used on its associated domain by checking `document.location.hostname`.
- When the user solves the reCaptcha, it generates a `g-recaptcha-response` token, which is the captcha solution, based on your browser history, google.com cookies and other browser data.
- This token is then validated by the backend server, which calls the Google API and passes the secret key shared between Google and your website.
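The server-side validation step can be sketched roughly as follows. This is a minimal sketch, assuming the publicly documented `siteverify` endpoint and its `secret`, `response` and optional `remoteip` form fields; the function names `build_siteverify_body`, `is_success` and `verify_token` are made up for illustration and are not part of any Google library.

```python
import json
import urllib.parse
import urllib.request

# Google's documented verification endpoint for reCAPTCHA tokens.
SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_siteverify_body(secret, token, remote_ip=None):
    """Encode the form fields sent to the siteverify endpoint (hypothetical helper)."""
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional: the end user's IP address
    return urllib.parse.urlencode(fields).encode("ascii")


def is_success(response_text):
    """Return True only if the JSON reply contains "success": true."""
    return json.loads(response_text).get("success") is True


def verify_token(secret, token, remote_ip=None, timeout=10):
    """POST the token to Google and report whether it was accepted."""
    body = build_siteverify_body(secret, token, remote_ip)
    request = urllib.request.Request(SITEVERIFY_URL, data=body)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return is_success(reply.read().decode("utf-8"))
```

A backend would call `verify_token(secret_key, token)` with the value of the submitted `g-recaptcha-response` field and reject the form when it returns `False`.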
How these captcha solver services work
- Expect the `data-sitekey` and `website-url` from the user.
- Create an HTML page containing a reCaptcha with the user-provided `data-sitekey`.
- Update the `hosts` file by adding an entry for the user-provided `website-url` that points it to `127.0.0.1`.
- Serve this HTML page from any web server installed on the local machine and access it using the user-provided `website-url`, which now points to `127.0.0.1`. This way, Google considers the request to be coming from a valid website and generates the reCaptcha.
- Once this reCaptcha is solved, the `g-recaptcha-response` token is generated and is valid for ~120 seconds. This token is then given back to the user for further steps.
- The user has to insert this token into the `textarea` with the id `g-recaptcha-response` and then submit the page.
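The page and `hosts` entry from the steps above might look something like this. It is only a sketch: the markup follows the standard reCAPTCHA v2 checkbox embed (the `api.js` script plus a `g-recaptcha` div), and the helper names `render_captcha_page` and `hosts_entry` are hypothetical, not taken from any actual solver service.

```python
import html
from urllib.parse import urlsplit


def render_captcha_page(site_key):
    """Return a minimal HTML page embedding the reCAPTCHA v2 widget."""
    safe_key = html.escape(site_key, quote=True)  # keep the attribute well-formed
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '  <script src="https://www.google.com/recaptcha/api.js" async defer></script>\n'
        "</head>\n"
        "<body>\n"
        '  <form action="?" method="POST">\n'
        f'    <div class="g-recaptcha" data-sitekey="{safe_key}"></div>\n'
        "  </form>\n"
        "</body>\n"
        "</html>\n"
    )


def hosts_entry(website_url):
    """Return the hosts-file line mapping the site's hostname to 127.0.0.1."""
    # Accept both full URLs and bare hostnames such as "example.com".
    target = website_url if "//" in website_url else "//" + website_url
    hostname = urlsplit(target).hostname
    if not hostname:
        raise ValueError(f"no hostname found in {website_url!r}")
    return f"127.0.0.1 {hostname}"
```

For example, `hosts_entry("https://example.com/login")` gives the line `127.0.0.1 example.com`, and the string from `render_captcha_page(...)` can be saved as `index.html` under the local web server's document root.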
References
I have explained how this works in my YouTube video Selenium automation of a website having google recaptcha.
The source no longer exists on GitHub because I deleted my GitHub account. If I can recover the source code, I will add it to my GitLab repository NiRRaNjAN RauT · GitLab.
Research paper: I’m not a human: Breaking the Google reCAPTCHA
Based on this knowledge, I have built my own captcha solver service, Fast Captcha Solver, at an affordable price.