I am trying to automate Google searches, but my IP has been blocked. After some searching, it seems that Tor could give me a new IP dynamically. However, after adding the following code block to my existing code, Google still blocks my requests, even from the new IP. Is there anything wrong with my code?

Code (based on this)

from TorCtl import TorCtl
import socks
import socket
import urllib2

socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)


__originalSocket = socket.socket

def newId():
    ''' Clean circuit switcher

    Restores socket to original system value.
    Calls TOR control socket and closes it
    Replaces system socket with socksified socket
    '''
    socket.socket = __originalSocket
    conn = TorCtl.connect(controlAddr="127.0.0.1", controlPort=9051, passphrase="mypassword")
    conn.send_signal("NEWNYM")
    conn.close()
    socket.socket = socks.socksocket

## generate a new ip
newId()

### verify the new ip
print(urllib2.urlopen("http://icanhazip.com/").read())

## run my scrape code
google_scrape()

New error message:

Sometimes you may be asked to solve the CAPTCHA if you are using advanced terms that robots are known to use, or sending requests very quickly.

IP address: 89.234.XX.25X
Time: 2017-02-12T05:02:53Z
TTT

1 Answer

Google (like many other sites, such as those "protected" by Cloudflare) filters requests arriving via Tor by the IP addresses of Tor exit nodes. It can do this because the list of Tor exit node IP addresses is public.
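To illustrate why such filtering is cheap, here is a minimal sketch (Python 3, unlike the Python 2 code in the question) of checking an address against a one-address-per-line exit list. The URL in `EXIT_LIST_URL` is my assumption about where the Tor Project publishes its bulk exit list, and the function names are made up for this example. How Google actually does its filtering is not public.

```python
import ipaddress
import urllib.request

# Assumption: location of the Tor Project's bulk exit list. Verify before use.
EXIT_LIST_URL = "https://check.torproject.org/torbulkexitlist"


def parse_exit_list(text):
    """Return the set of valid IP addresses in a one-address-per-line list."""
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            exits.add(ipaddress.ip_address(line))
        except ValueError:
            continue  # skip anything that is not an IP address
    return exits


def is_tor_exit(ip, exits):
    """Check whether `ip` (a string) is in the parsed exit set."""
    return ipaddress.ip_address(ip) in exits


def fetch_exit_list(url=EXIT_LIST_URL, timeout=10):
    """Download and parse the exit list (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_exit_list(resp.read().decode("ascii", errors="ignore"))


if __name__ == "__main__":
    # Offline demo using documentation-range addresses, not real exit nodes.
    sample = "# sample list\n192.0.2.10\n198.51.100.7\nnot-an-ip\n"
    exits = parse_exit_list(sample)
    print(is_tor_exit("192.0.2.10", exits))
    print(is_tor_exit("203.0.113.5", exits))
```

A site only needs a set lookup per request, so every exit node Tor hands you is equally easy to recognise.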

Changing your identity therefore does not get around this block. It switches your Tor circuit and will likely give you a different exit node and thus a different IP (although neither is guaranteed), but the new address still belongs to a known exit node.
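Since a new identity does not guarantee a new IP, it can help to check that the address actually changed before continuing. Below is a rough sketch in Python 3. `switch_exit_ip` and its parameters are hypothetical names; in the question's code the two callbacks would correspond to the icanhazip.com lookup and `newId()`. Note that this only confirms the IP changed, it does nothing about the exit-node block itself.

```python
import time


def switch_exit_ip(get_ip, request_new_identity,
                   max_attempts=5, wait_seconds=10.0, sleep=time.sleep):
    """Request new identities until the observed IP differs from the old one.

    get_ip: callable returning the current public IP as a string.
    request_new_identity: callable that sends NEWNYM to Tor.
    Returns the new IP, or None if it never changed.
    """
    old_ip = get_ip()
    for _ in range(max_attempts):
        request_new_identity()
        # Tor rate-limits NEWNYM, so give it time before checking again.
        sleep(wait_seconds)
        new_ip = get_ip()
        if new_ip != old_ip:
            return new_ip
    return None
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting.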

For your use case, you might consider using a VPN instead of Tor, since VPN IP addresses are less likely to be blocked, especially if you use a paid VPN.

George Y.
  • Thanks for your suggestion. But does that mean I need to have multiple VPNs? It looks like if I do not pause my code between queries, it will be blocked pretty soon... – TTT Feb 12 '17 at 05:36
  • In this case, probably yes. You might also consider pausing your code between queries, or doing something else that complies with Google's ToS. – George Y. Feb 12 '17 at 16:33
  • Thanks sir for the suggestions! – TTT Feb 12 '17 at 18:52
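
The pausing between queries suggested in the comments could be sketched roughly as follows (Python 3). The `throttled` helper and its delay values are hypothetical and would need tuning; nothing here is a known threshold for Google's rate limiting.

```python
import random
import time


def throttled(items, min_delay=5.0, max_delay=15.0,
              sleep=time.sleep, rng=random.uniform):
    """Yield items one at a time, sleeping a random interval between them."""
    for index, item in enumerate(items):
        if index > 0:
            # Randomised pause so requests are not evenly spaced.
            sleep(rng(min_delay, max_delay))
        yield item
```

Usage would look something like `for query in throttled(queries): ...`, with the scraping call for each query in the loop body.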