
I have some code using the PRAW API that I would like to run through a proxy. I found the following code under this question, and it works for me.


import socks
import socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
socket.socket = socks.socksocket
import urllib2

print(urllib2.urlopen("http://www.ifconfig.me/ip").read())

My question is whether this will also route PRAW's networking through the proxy. I started reading the PRAW source to understand how it works, but it is too complicated for me and I could not tell whether this would work. Does anyone have enough experience to tell me how PRAW does its networking, how the SOCKS proxy works exactly, or how I could find out?

Thank you very much for helping me out.

pythoniac

1 Answer


Is this achievable?

Yes. PRAW uses the Requests library to make HTTP(S) requests. According to this answer and the Requests documentation, requests>=2.10.0 supports SOCKS proxies via PySocks.


Version compatibility

As of this answer, prawcore (which praw depends on) requires requests >=2.6.0, <3.0. You likely have some version of requests installed that is at least 2.10.0, but you can check with the following:

$ python3 -c 'import requests; print(requests.__version__)'

If your installed version is less than 2.10.0, upgrade with:

$ python3 -m pip install 'requests >=2.10.0, <3.0'
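The same check can also be done from Python. The following is a minimal sketch, not part of PRAW or Requests: the helper names `version_supports_socks` and `socks_ready` are made up for illustration. It assumes Python 3.8+ (for `importlib.metadata`) and that PySocks, when installed, is importable as the top-level module `socks`.

```python
import importlib.util
import re
from importlib import metadata

# Requests gained SOCKS support in 2.10.0 (per the Requests documentation).
MIN_SOCKS_VERSION = (2, 10)


def version_supports_socks(version):
    """Return True if a Requests version string is at least 2.10."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= MIN_SOCKS_VERSION


def socks_ready():
    """Return True only if a new enough Requests and PySocks are both installed."""
    try:
        installed = metadata.version("requests")
    except metadata.PackageNotFoundError:
        return False
    # PySocks provides a top-level module named "socks".
    has_pysocks = importlib.util.find_spec("socks") is not None
    return version_supports_socks(installed) and has_pysocks
```

If `socks_ready()` returns False even though Requests is recent enough, PySocks is probably missing; installing the `requests[socks]` extra pulls it in.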

Proxy configuration

According to the linked answer, we set up a dict of our proxy (using 'https' in place of 'http' because all of PRAW's requests happen over HTTPS):

proxies = {'https': 'socks5://127.0.0.1:9050'}

We then have to pass this to the Session that PRAW uses. The Session constructor takes no arguments, so we create the Session first and then set the proxies on it:

import requests
socks_session = requests.Session()
socks_session.proxies.update(proxies)

The PRAW documentation describes how to use a custom Session:

The requestor_class and requestor_kwargs allow for customization of the requestor Reddit will use. This allows, e.g., easily adding behavior to the requestor or wrapping its Session in a caching layer.

Here is how we pass in our custom Session to PRAW:

import praw

reddit = praw.Reddit(client_id='XX',
                     client_secret='XX',
                     user_agent='my_bot by pythoniac',
                     # ... more kwargs ...
                     # Hand PRAW the Session that carries the SOCKS proxy settings.
                     requestor_kwargs={'session': socks_session})

Any requests that PRAW makes through Requests will go through the SOCKS proxy.
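One way to find out whether traffic really leaves through the proxy is to ask an IP echo service for your address with and without the proxied Session. The sketch below is an assumption-laden illustration: `exit_ip` is a hypothetical helper, and the URL `https://ifconfig.me/ip` is assumed to return the caller's IP as plain text (the question used the same service over HTTP). An HTTPS URL is used because the proxies dict above only covers 'https'.

```python
def exit_ip(session, url="https://ifconfig.me/ip", timeout=10):
    """Return the public IP address the echo service sees for this session.

    `session` is expected to behave like requests.Session: it needs a
    get(url, timeout=...) method returning a response with
    raise_for_status() and a .text attribute.
    """
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text.strip()


# Usage (needs network access and a running proxy, so it is left commented out):
#
#   import requests
#   direct_ip = exit_ip(requests.Session())
#   proxied_ip = exit_ip(socks_session)
#   print(direct_ip, proxied_ip)
```

If the two addresses differ, requests made through `socks_session`, and therefore PRAW's requests, are going out via the proxy.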


DNS resolution

Note what the Requests documentation says about SOCKS proxies when it comes to DNS resolution:

Using the scheme socks5 causes the DNS resolution to happen on the client, rather than on the proxy server. This is in line with curl, which uses the scheme to decide whether to do the DNS resolution on the client or proxy. If you want to resolve the domains on the proxy server, use socks5h as the scheme.
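To make the choice between the two schemes explicit, the proxies dict can be built by a small helper. This is only a sketch: `build_socks_proxies` is a hypothetical function, and the host and port are the local values from the question (127.0.0.1:9050).

```python
def build_socks_proxies(host, port, remote_dns=True):
    """Build a Requests-style proxies dict for a SOCKS5 proxy.

    remote_dns=True  -> "socks5h": hostnames are resolved by the proxy.
    remote_dns=False -> "socks5":  hostnames are resolved on the client.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    proxy_url = f"{scheme}://{host}:{port}"
    # Cover both schemes so plain HTTP requests are proxied as well.
    return {"http": proxy_url, "https": proxy_url}


proxies = build_socks_proxies("127.0.0.1", 9050)
```

With a proxy such as Tor, resolving on the proxy (`socks5h`) is usually what you want, since client-side DNS lookups would otherwise bypass the proxy.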

jarhill0