13

How to route urllib requests through the TOR network?

Ciro Santilli OurBigBook.com
Lobe
  • What have you tried? TOR should be largely transparent to you. Try using urllib2; post your code and error messages. – S.Lott Apr 02 '09 at 19:59
  • I have no code or error messages - I am asking how to do it. – Lobe Apr 02 '09 at 20:04
  • @Lobe: Tor anonymizes your requests -- it conceals you from the web site. It doesn't do anything to the basic method of making HTTP requests -- that's why there's no documentation. Nothing changes except that now you're anonymous. – S.Lott Apr 02 '09 at 20:24

3 Answers

12

This works for me (using urllib2, haven't tried urllib):

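import urllib2

def req(url):
    # 8118 is Privoxy's default port; this assumes Privoxy is forwarding to Tor
    proxy_support = urllib2.ProxyHandler({"http": "127.0.0.1:8118"})
    opener = urllib2.build_opener(proxy_support)
    # Some sites reject the default Python user agent
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    return opener.open(url).read()

print req('http://google.com')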
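
For anyone on Python 3, where urllib2 was folded into urllib.request, a rough equivalent might look like the sketch below. It assumes an HTTP proxy (for example Privoxy chained to Tor) is listening on 127.0.0.1:8118; the names make_tor_opener and PROXY are made up for illustration, and the https entry is my addition so that HTTPS URLs go through the proxy too.

```python
import urllib.request

# Assumption: an HTTP proxy (e.g. Privoxy forwarding to Tor) listens here.
PROXY = "http://127.0.0.1:8118"


def make_tor_opener(proxy=PROXY):
    """Return an opener that sends http and https requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-agent", "Mozilla/5.0")]
    return opener


def req(url, timeout=30):
    """Fetch url through the proxy and return the raw response bytes."""
    with make_tor_opener().open(url, timeout=timeout) as response:
        return response.read()

# Example call (needs the proxy to be running):
# print(req("http://google.com")[:200])
```

Note that this only covers requests made through the returned opener; plain urllib.request.urlopen calls elsewhere in the program are not affected.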
jahmax
6

Tor works as a proxy, right? So ask yourself "How do I use proxies in urllib?"

Now, when I look at the docs, first thing I see is

urllib.urlopen(url[, data[, proxies]])

which seems pretty suggestive to me...
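
One caveat: in Python 3, urllib.request.urlopen no longer takes a proxies argument. The default opener reads proxy settings from the environment instead, so a minimal sketch of the same idea could set the *_proxy variables before making requests. The proxy address below is an assumption (an HTTP proxy such as Privoxy on 127.0.0.1:8118), not something from the question.

```python
import os
import urllib.request

# Assumption: an HTTP proxy such as Privoxy listens on 127.0.0.1:8118.
os.environ["http_proxy"] = "http://127.0.0.1:8118"
os.environ["https_proxy"] = "http://127.0.0.1:8118"

# getproxies_environment() returns the scheme-to-proxy mapping that urllib
# derives from the *_proxy environment variables.
proxies = urllib.request.getproxies_environment()
print(proxies["http"])
print(proxies["https"])
```

Setting the variables in the shell before starting Python should have the same effect, without touching the code.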

dmckee --- ex-moderator kitten
2

I managed to make a urllib.request call to an onion URL. I found a solution based on this post: Python 3.2 : urllib, SSL and TOR through socket : error with fileno function

Here is the modified code:

import socks
import socket

# This function does no local DNS resolution: the address is handed to the
# SOCKS proxy as given, so a hostname such as a .onion address never leaks to your resolver
def create_connection_fixed_dns_leak(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock

# MUST BE SET BEFORE IMPORTING URLLIB
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
# patch the socket module
socket.socket = socks.socksocket
socket.create_connection = create_connection_fixed_dns_leak

from urllib import request

if __name__ == "__main__":
    for proxy in request.getproxies():
        print(str(proxy))
    url = 'http://url_of_hidden_service.onion:port'
    req = request.Request(url)
    res = request.urlopen(req)
    print(str(res.read()))
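
If the request just hangs or fails, it may be worth checking first that something is actually listening on the SOCKS port. Below is a minimal stdlib-only sketch; the helper name port_is_open is hypothetical, and 9050 is assumed to be the local Tor SOCKS port as in the code above. It should be called before the socket module is patched, since afterwards create_connection would itself go through the proxy.

```python
import socket


def port_is_open(host="127.0.0.1", port=9050, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False

# Example: print(port_is_open()) before applying the socks patch.
```

A True result only shows that the port accepts connections, not that the listener is really Tor.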
Community
Mordred