35

I used the Mechanize module a while ago and am now trying to switch to the Requests module.
(Python mechanize doesn't work when HTTPS and proxy authentication are required.)

I have to go through a proxy server to access the Internet.
The proxy server requires authentication. I wrote the following code:

import requests
from requests.auth import HTTPProxyAuth

proxies = {"http":"192.168.20.130:8080"}
auth = HTTPProxyAuth("username", "password")

r = requests.get("http://www.google.co.jp/", proxies=proxies, auth=auth)

The above code works well when the proxy server requires basic authentication.
Now I want to know what I have to do when the proxy server requires digest authentication.
HTTPProxyAuth does not seem to work with digest authentication (r.status_code returns 407).
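
To make the 407 case concrete, here is a minimal sketch that reads the scheme a proxy is asking for from a response's status code and headers. The helper name `proxy_challenge_scheme` is hypothetical, and the sketch assumes the proxy sends a single challenge in `Proxy-Authenticate`.

```python
def proxy_challenge_scheme(status_code, headers):
    """Return the auth scheme a proxy asks for ('basic', 'digest', ...), or None."""
    if status_code != 407:
        # A 401 would be a challenge from the website, not from the proxy.
        return None
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    challenge = lowered.get("proxy-authenticate", "").strip()
    if not challenge:
        return None
    # Assumption: a single challenge, whose first token is the scheme.
    return challenge.split(None, 1)[0].lower()


sample_headers = {"Proxy-Authenticate": 'Digest realm="proxy", nonce="abc", qop="auth"'}
print(proxy_challenge_scheme(407, sample_headers))
```

With a real response `r` this could be called as `proxy_challenge_scheme(r.status_code, r.headers)`.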

yutaka2487

9 Answers

35

In most cases there is no need to implement your own.

Requests has built-in support for proxies with basic authentication:

proxies = { 'https' : 'https://user:password@proxyip:port' } 
r = requests.get('https://url', proxies=proxies) 

See the docs for more.

If you need digest authentication, HTTPDigestAuth may help,
or you might need to extend it like yutaka2487 did below.

Note: you must use the IP address of the proxy server, not its hostname!
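
As a rough illustration of why credentials in the proxy URL only cover Basic authentication: a Basic `Proxy-Authorization` header can be computed from the URL alone, with no challenge from the proxy, whereas Digest needs the proxy's nonce first. This is a stdlib-only sketch, not the actual code in requests; the function name `basic_proxy_authorization` and the address `10.0.0.1:8080` are made up for the example.

```python
import base64
from urllib.parse import unquote, urlsplit


def basic_proxy_authorization(proxy_url):
    """Build a Basic Proxy-Authorization value from user:password in a proxy URL."""
    parts = urlsplit(proxy_url)
    if parts.username is None:
        return None  # no credentials embedded in the URL
    # Credentials in a URL may be percent-encoded, so decode them first.
    username = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_proxy_authorization("http://user:password@10.0.0.1:8080"))
```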

imbr
  • 3
    It only works for Basic authentication, not Digest authentication which is what the OP asked for. – Tey' Oct 21 '19 at 04:34
  • 2
    It will not work because `HTTPDigestAuth` only supports authentication with the final website server (`WWW-Authenticate`/`Authorization` headers, 401 status), not the proxy server (`Proxy-Authenticate`/`Proxy-Authorization` headers, 407 status). You need a solution similar to the one given by @yutaka2487 for that, but it only works for contacting a HTTP server through proxy, not a HTTPS server, because requests/urllib3 backend does not report proxy error when tunneling HTTPS connections, so Digest auth cannot work properly. – Tey' Oct 22 '19 at 03:08
  • @Tey' just consider `or you might need try to extend it like yutaka2487 did bellow` – imbr Feb 19 '21 at 21:39
22

I wrote a class that can be used for proxy authentication (based on digest auth).
I borrowed almost all of the code from requests.auth.HTTPDigestAuth.

import requests
import requests.auth

class HTTPProxyDigestAuth(requests.auth.HTTPDigestAuth):
    def handle_407(self, r):
        """Takes the given response and tries digest-auth, if needed."""

        num_407_calls = r.request.hooks['response'].count(self.handle_407)

        s_auth = r.headers.get('Proxy-authenticate', '')

        if 'digest' in s_auth.lower() and num_407_calls < 2:

            self.chal = requests.auth.parse_dict_header(s_auth.replace('Digest ', ''))

            # Consume content and release the original connection
            # to allow our new request to reuse the same one.
            r.content
            r.raw.release_conn()

            # A 407 challenge is answered with Proxy-Authorization, not Authorization.
            r.request.headers['Proxy-Authorization'] = self.build_digest_header(r.request.method, r.request.url)
            r.request.send(anyway=True)
            _r = r.request.response
            _r.history.append(r)

            return _r

        return r

    def __call__(self, r):
        if self.last_nonce:
            r.headers['Proxy-Authorization'] = self.build_digest_header(r.method, r.url)
        r.register_hook('response', self.handle_407)
        return r

Usage:

proxies = {
    "http" :"192.168.20.130:8080",
    "https":"192.168.20.130:8080",
}
auth = HTTPProxyDigestAuth("username", "password")

# HTTP
r = requests.get("http://www.google.co.jp/", proxies=proxies, auth=auth)
r.status_code # 200 OK

# HTTPS
r = requests.get("https://www.google.co.jp/", proxies=proxies, auth=auth)
r.status_code # 200 OK
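
For readers wondering what the digest header contains, below is a minimal sketch of the MD5 / `qop=auth` response computation from RFC 2617, which is the kind of value `build_digest_header` produces. The function names are hypothetical, and other algorithms or `qop` values are not handled. For a proxy, the resulting value is sent in `Proxy-Authorization` rather than `Authorization`.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, nonce, method, uri, nc, cnonce, qop="auth"):
    """Compute the 'response' field of a digest header (MD5, qop=auth only)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret part: who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # request part: what is asked
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Values taken from the worked example in RFC 2617, section 3.5.
print(digest_response(
    username="Mufasa",
    password="Circle Of Life",
    realm="testrealm@host.com",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    method="GET",
    uri="/dir/index.html",
    nc="00000001",
    cnonce="0a4f113b",
))
```

The nonce comes from the proxy's 407 challenge, which is why the first request has to fail before an authenticated one can be built.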
yutaka2487
  • 10
    I get the error: 'HTTPProxyDigestAuth' object has no attribute 'last_nonce'. When I try and use your class. I'll look into it. – MattCochrane Oct 21 '15 at 22:14
  • 6
    No need to implement your own now, `requests` now has built in support for proxies, e.g. `proxies = { 'https' : 'https://user:password@ip:port' } ; r = requests.get('https://url', proxies=proxies)` see http://docs.python-requests.org/en/latest/user/advanced/ – BurnsBA Jan 20 '16 at 01:17
  • @BurnsBA @MattClimbs @yutaka I can confirm that the use of requests in Python 3 with https and the `user:password@ip:port` works great. – james-see Jan 22 '16 at 22:07
  • 2
    This snippet does not work anymore with recent versions of requests, because it now puts request parameters in a thread local storage. Also, even after having fixed the code, [this will not work for connecting to HTTPS website through proxy](https://github.com/psf/requests/issues/2526#issuecomment-89514502). – Tey' Oct 22 '19 at 03:15
  • So @Tey' is there a way to fix this? (Use the digest in the proxy) – dcalap Jan 30 '20 at 17:18
  • @dcalap Check [the answer](https://stackoverflow.com/a/60031108/5099839) I've just posted – Tey' Feb 02 '20 at 22:15
  • 1
    `proxies = {'https': 'http://user:password@ip:port'}` worked for me. Note `http://`. – omegastripes Sep 15 '21 at 09:03
  • 1
    @omegastripes if it worked for you, it means that it uses Basic Auth, not digest, which is what OP asked – robertspierre Dec 30 '21 at 22:11
  • 1
    @BurnsBA the [requests documentation](https://docs.python-requests.org/en/latest/user/advanced/#proxies) makes clear that the built-in proxy support works only for Basic Auth, not for Digest, which is what OP asked – robertspierre Dec 30 '21 at 22:12
9

I've written a Python module (available here) which makes it possible to authenticate with an HTTP proxy using the digest scheme. It works when connecting to HTTPS websites (through monkey patching) and allows authenticating with the website as well. This should work with the latest requests library for both Python 2 and 3.

The following example fetches the webpage https://httpbin.org/ip through HTTP proxy 1.2.3.4:8080 which requires HTTP digest authentication using user name user1 and password password1:

import requests
from requests_digest_proxy import HTTPProxyDigestAuth

s = requests.Session()
s.proxies = {
        'http': 'http://1.2.3.4:8080/',
        'https': 'http://1.2.3.4:8080/'
}
s.auth = HTTPProxyDigestAuth('user1', 'password1')

print(s.get('https://httpbin.org/ip').text)

Should the website require some kind of HTTP authentication, this can be specified to the HTTPProxyDigestAuth constructor this way:

# HTTP Basic authentication for website
s.auth = HTTPProxyDigestAuth(('user1', 'password1'),
        auth=requests.auth.HTTPBasicAuth('user1', 'password0'))
print(s.get('https://httpbin.org/basic-auth/user1/password0').text)

# HTTP Digest authentication for website
s.auth = HTTPProxyDigestAuth(('user1', 'password1'),
        auth=requests.auth.HTTPDigestAuth('user1', 'password0'))
print(s.get('https://httpbin.org/digest-auth/auth/user1/password0').text)
robertspierre
Tey'
3

This snippet works for both types of requests (HTTP and HTTPS). Tested on requests 2.23.0, the current version at the time of writing.

import re
import requests
from requests.utils import get_auth_from_url
from urllib3.util import parse_url


def get_proxy_authorization_header(proxy, method):
    # Send a preliminary request straight to the proxy so that the digest
    # handshake (407 challenge, then retry) yields a Proxy-Authorization header.
    username, password = get_auth_from_url(proxy)
    auth = HTTPProxyDigestAuth(username, password)
    proxy_url = parse_url(proxy)
    proxy_response = requests.request(method, proxy_url, auth=auth)
    return proxy_response.request.headers['Proxy-Authorization']


class HTTPSAdapterWithProxyDigestAuth(requests.adapters.HTTPAdapter):
    def proxy_headers(self, proxy):
        # These headers go out with the CONNECT request that opens the tunnel.
        headers = {}
        proxy_auth_header = get_proxy_authorization_header(proxy, 'CONNECT')
        headers['Proxy-Authorization'] = proxy_auth_header
        return headers


class HTTPAdapterWithProxyDigestAuth(requests.adapters.HTTPAdapter):
    def proxy_headers(self, proxy):
        return {}

    def add_headers(self, request, **kwargs):
        # For plain HTTP the header is attached to the request itself.
        proxy = kwargs['proxies'].get('http', '')
        if proxy:
            proxy_auth_header = get_proxy_authorization_header(proxy, request.method)
            request.headers['Proxy-Authorization'] = proxy_auth_header



class HTTPProxyDigestAuth(requests.auth.HTTPDigestAuth):

    def init_per_thread_state(self):
        # Ensure state is initialized just once per-thread
        if not hasattr(self._thread_local, 'init'):
            self._thread_local.init = True
            self._thread_local.last_nonce = ''
            self._thread_local.nonce_count = 0
            self._thread_local.chal = {}
            self._thread_local.pos = None
            self._thread_local.num_407_calls = None

    def handle_407(self, r, **kwargs):
        """
        Takes the given response and tries digest-auth, if needed.
        :rtype: requests.Response
        """

        # If response is not 407, do not auth
        if r.status_code != 407:
            self._thread_local.num_407_calls = 1
            return r

        s_auth = r.headers.get('proxy-authenticate', '')

        if 'digest' in s_auth.lower() and self._thread_local.num_407_calls < 2:
            self._thread_local.num_407_calls += 1
            pat = re.compile(r'digest ', flags=re.IGNORECASE)
            self._thread_local.chal = requests.utils.parse_dict_header(
                    pat.sub('', s_auth, count=1))

            # Consume content and release the original connection
            # to allow our new request to reuse the same one.
            r.content
            r.close()
            prep = r.request.copy()
            requests.cookies.extract_cookies_to_jar(prep._cookies, r.request, r.raw)
            prep.prepare_cookies(prep._cookies)

            prep.headers['Proxy-Authorization'] = self.build_digest_header(prep.method, prep.url)
            _r = r.connection.send(prep, **kwargs)
            _r.history.append(r)
            _r.request = prep

            return _r

        self._thread_local.num_407_calls = 1
        return r

    def __call__(self, r):
        # Initialize per-thread state, if needed
        self.init_per_thread_state()
        # If we have a saved nonce, skip the 407
        if self._thread_local.last_nonce:
            r.headers['Proxy-Authorization'] = self.build_digest_header(r.method, r.url)

        r.register_hook('response', self.handle_407)
        self._thread_local.num_407_calls = 1

        return r


session = requests.Session()
session.proxies = {
    'http': 'http://username:password@proxyhost:proxyport',
    'https':  'http://username:password@proxyhost:proxyport'
}
session.trust_env = False

session.mount('http://', HTTPAdapterWithProxyDigestAuth())
session.mount('https://', HTTPSAdapterWithProxyDigestAuth())

response_http = session.get("http://ww3.safestyle-windows.co.uk/the-secret-door/")
print(response_http.status_code)

response_https = session.get("https://stackoverflow.com/questions/13506455/how-to-pass-proxy-authentication-requires-digest-auth-by-using-python-requests")
print(response_https.status_code)

Generally, the problem of proxy authorization also applies to other types of authentication (NTLM, Kerberos) when connecting over HTTPS. Despite the large number of issues (since 2013, and maybe there are earlier ones that I did not find):

in requests: Digest Proxy Auth, NTLM Proxy Auth, Kerberos Proxy Auth

in urllib3: NTLM Proxy Auth, NTLM Proxy Auth

and many, many others, the problem is still not resolved.

The root of the problem is in the function _tunnel of the module httplib (Python 2) / http.client (Python 3). When the connection attempt fails, it raises an OSError without returning the response code (407 in our case) or the additional data needed to build the authorization header. Lukasa gave an explanation here. As long as there is no solution from the maintainers of urllib3 (or requests), we can only use various workarounds (for example, use the approach of @Tey' or do something like this).

In my version of the workaround, we prepare the necessary authorization data in advance by sending a request to the proxy server and processing the received response.
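
To see the data that gets lost when the tunnel fails, one option is to send the CONNECT request by hand and read the proxy's reply. The following is a minimal stdlib sketch under stated assumptions: `parse_connect_reply` and `probe_proxy` are hypothetical names, the target `example.org:443` is a placeholder, and error handling is left out.

```python
import http.client
import io
import socket


def parse_connect_reply(raw):
    """Split a raw proxy reply into (status_code, Proxy-Authenticate value or None)."""
    stream = io.BytesIO(raw)
    status_line = stream.readline().decode("iso-8859-1")
    # Status line looks like: "HTTP/1.1 407 Proxy Authentication Required"
    status_code = int(status_line.split(None, 2)[1])
    headers = http.client.parse_headers(stream)
    return status_code, headers.get("Proxy-Authenticate")


def probe_proxy(proxy_host, proxy_port, target="example.org:443", timeout=10):
    """Send an unauthenticated CONNECT and return the parsed reply."""
    request = f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(request)
        raw = b""
        # Read until the end of the header block (or until the proxy hangs up).
        while b"\r\n\r\n" not in raw:
            chunk = sock.recv(4096)
            if not chunk:
                break
            raw += chunk
    return parse_connect_reply(raw)


# Offline demonstration on a canned reply; probe_proxy() needs a real proxy.
canned = (b"HTTP/1.1 407 Proxy Authentication Required\r\n"
          b'Proxy-Authenticate: Digest realm="proxy", nonce="abc"\r\n'
          b"Content-Length: 0\r\n\r\n")
print(parse_connect_reply(canned))
```

The returned header value carries the scheme and the challenge parameters (realm, nonce and so on) that a digest header has to be built from.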

zanuda
  • If putting together an app that would need to work in both environments where Basic _or_ Digest proxy auth might be needed, is there a pre-emptive way to determine whether you need to create/use an auth based on Digest or Basic? – Richard Apr 29 '21 at 01:55
  • If this is authentication on the site, then there is a way. You send a request, receive a response with a 401 code and an indication of what type of authentication is used. Having received this information, you can form a header with authentication data as needed and send a new request with this header. But if we are talking about proxy authentication in requests, then this will not work. If we send a request with incorrect data, we will receive an OSError. And in the text of the error there is not a bit of information about the type of authentication. – zanuda Apr 30 '21 at 15:21
  • Thanks! ...that was my feeling/fear. Is it just me, or is handling proxies, especially when they use authentication (which is the most common scenario in a corporate proxy env) a real problem in requests? As an alternative concept - at least for Windows machines - is there any simpler approach that will just piggy-back onto whatever is configured in the OS and use that? – Richard Apr 30 '21 at 18:57
  • Yes, I think this is really a problem of `requests`. The roots of this problem are in the library `http` of `python`. As for simple ways to send an HTTP request through a proxy on Windows - I managed to do it through `curl` for a proxy with Kerberos authentication; there it was enough to indicate that this type of authentication is used on the proxy server, and after that `curl` itself found all the necessary credentials stored in the system. – zanuda May 02 '21 at 19:29
  • Exactly. I've also put together a script in PowerShell that I'm now replacing with Python. In PowerShell it was trivial - it also figured out what auth method was being used and just used that. ...simple and easy. That this is such an unsupported mess in requests isn't good. – Richard May 02 '21 at 23:58
2

You can use digest authentication by using requests.auth.HTTPDigestAuth instead of requests.auth.HTTPProxyAuth

barracel
  • I wanted to pass proxy auth (based on digest auth). That is different from usual digest auth. So I needed to extend HTTPDigestAuth (see below). – yutaka2487 Nov 22 '12 at 22:04
2

For those of you who still end up here, there appears to be a project called requests-toolbelt that provides this along with other common functionality that is not built into requests.

https://toolbelt.readthedocs.org/en/latest/authentication.html#httpproxydigestauth

pcreech
2

This works for me. However, I don't know how secure it is to put user:password in the URL in this solution:

import requests
import os

http_proxyf = 'http://user:password@proxyip:port'
os.environ["http_proxy"] = http_proxyf
os.environ["https_proxy"] = http_proxyf

sess = requests.Session()
# may need sess.trust_env = True (this should already be the default)
print(sess.get('https://some.org').text)
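
A quick way to check which proxies get picked up from the environment is the standard library's `urllib.request.getproxies()`; as far as I know, requests relies on the same lookup when `trust_env` is enabled. This is a small sketch; the proxy address and port 8080 are placeholders.

```python
import os
import urllib.request

# Placeholder credentials and address, set only for this process.
proxy_url = "http://user:password@proxyip:8080"
os.environ["http_proxy"] = proxy_url
os.environ["https_proxy"] = proxy_url

# Maps a URL scheme to the proxy URL found in the environment.
proxies = urllib.request.getproxies()
print(proxies.get("http"), proxies.get("https"))
```

Note that the password sits in the process environment in clear text, so anything that can read the environment can read the credentials.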
Кое Кто
1
import requests
import os


# in my case I had to add my local domain
proxies = {
  'http': 'proxy.myagency.com:8080',
  'https': 'user@localdomain:password@proxy.myagency.com:8080',
}


r=requests.get('https://api.github.com/events', proxies=proxies)
print(r.text)
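
A user name that contains `@` (such as a domain account) makes the proxy URL ambiguous, so a safer variant is to percent-encode the credentials before putting them in the URL. Below is a minimal sketch; `build_proxy_url` is a hypothetical helper, and it assumes that requests decodes the percent-escapes again when it extracts the credentials (which is my understanding of its `get_auth_from_url` helper).

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Assemble a proxy URL with percent-encoded credentials."""
    # safe="" also encodes '@', ':' and '/', which would otherwise break the URL.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


proxy = build_proxy_url("user@localdomain", "p@ss:word", "proxy.myagency.com", 8080)
print(proxy)
proxies = {"http": proxy, "https": proxy}
```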
0

Here is an answer for a proxy that does not use HTTP Basic Authentication, for example a transparent proxy within an organization.

import requests

url      = 'https://someaddress-behindproxy.com'
params   = {'apikey': '123456789'}                     #if you need params
proxies  = {'https': 'https://proxyaddress.com:3128'}  #or some other port
response = requests.get(url, proxies=proxies, params=params)

I hope this helps someone.

Belial