
Is there any way to enable some kind of verbose logging for urllib? In particular, I'm trying to find out which TLS certificate files and which proxy it is using, i.e. whether it is actually using what I configured in the environment.

Philippe

1 Answer


In Python 3.5.1 and earlier, you can do this in two ways:

  1. You can use the constructor argument for HTTPHandler and HTTPSHandler (as demonstrated in this SO answer):

    import urllib.request
    
    handler = urllib.request.HTTPHandler(debuglevel=10)
    opener = urllib.request.build_opener(handler)
    content = opener.open('http://example.com').read()
    
    print(content[0:120]) 
    
  2. You can set the http.client.HTTPConnection.debuglevel class variable to enable logging for all future connections.

    import urllib.request
    import http.client
    
    http.client.HTTPConnection.debuglevel = 1
    content = urllib.request.urlopen('http://example.com').read()
    
    print(content[0:120])
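
As a side note, http.client emits this debug output with plain print() calls, so it goes to stdout and not through the logging module. The sketch below illustrates that by capturing stdout while fetching from a throwaway local server. The names `HelloHandler` and `fetch_with_debug` are hypothetical and exist only for this demo, and the empty ProxyHandler is an assumption added so that proxy settings from the environment cannot interfere with the local request.

```python
import contextlib
import io
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical throwaway handler used only for this demo."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the server's own access log quiet


def fetch_with_debug():
    """Fetch from a local server and return (body, captured debug output)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),           # assumption: bypass env proxies
            urllib.request.HTTPHandler(debuglevel=1),  # method 1 from the list above
        )
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        captured = io.StringIO()
        # http.client prints its debug lines, so redirecting stdout captures them.
        with contextlib.redirect_stdout(captured):
            with opener.open(url) as resp:
                body = resp.read()
        return body, captured.getvalue()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    body, debug_output = fetch_with_debug()
    print(debug_output)
```

If you want these lines in a log file, redirecting stdout like this is one option, since there is no logger to configure.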
    

In Python version 3.5.2 and later, the second method no longer works (the first one still works fine though). To use the http.client.HTTPConnection.debuglevel class variable, you will need to monkey patch the __init__ methods of HTTPHandler and HTTPSHandler like so (at least until this PR is merged and back-ported):

    import http.client
    import urllib.request

    # Keep a reference to the original constructor so the wrapper can delegate to it.
    https_old_init = urllib.request.HTTPSHandler.__init__

    def https_new_init(self, debuglevel=None, context=None, check_hostname=None):
        # Fall back to the class variable at call time when no explicit level is given.
        debuglevel = debuglevel if debuglevel is not None else http.client.HTTPSConnection.debuglevel
        https_old_init(self, debuglevel, context, check_hostname)

    urllib.request.HTTPSHandler.__init__ = https_new_init

    http_old_init = urllib.request.HTTPHandler.__init__

    def http_new_init(self, debuglevel=None):
        # Same fallback for plain HTTP, reading the HTTPConnection class variable.
        debuglevel = debuglevel if debuglevel is not None else http.client.HTTPConnection.debuglevel
        http_old_init(self, debuglevel)

    urllib.request.HTTPHandler.__init__ = http_new_init

(Note: I don't recommend using http.client.HTTPConnection.debuglevel directly as the default value of the debuglevel argument in the handlers' constructors, because default argument values are evaluated once, when the function is defined. For HTTPHandler's constructor, that is when the urllib.request module is imported.)
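
To illustrate the note above, here is a minimal sketch of the difference between a default captured at definition time and one resolved at call time. `Config`, `eager` and `lazy` are hypothetical names; `Config` merely stands in for http.client.HTTPConnection.

```python
class Config:
    """Hypothetical stand-in for http.client.HTTPConnection."""
    debuglevel = 0


def eager(debuglevel=Config.debuglevel):
    # The default was captured once, when `def` ran, so it stays 0.
    return debuglevel


def lazy(debuglevel=None):
    # The class variable is looked up on every call, so later changes are seen.
    return Config.debuglevel if debuglevel is None else debuglevel


# Change the "global" setting after both functions have been defined.
Config.debuglevel = 1

if __name__ == "__main__":
    print(eager(), lazy(), lazy(5))
```

This is why the patched constructors use None as the default and resolve the class variable inside the function body.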

The reason you have to do this (if you want to use the http.client.HTTPConnection.debuglevel class variable as a global setting) is a change introduced in Python 3.5.2. Since then, the handler sets the debuglevel instance variable on each HTTPConnection (which normally just shadows the class variable of the same name) to whatever value the debuglevel constructor argument of HTTPHandler or HTTPSHandler has, irrespective of whether the argument was passed or not. Because that argument defaults to 0, the class variable is always overridden on the instance, either by the value passed to the constructor or by the default, 0.
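
The shadowing can be reproduced with a simplified model. This is not the real urllib source: `ToyConnection` and `ToyHandler` are hypothetical classes that only mimic the behaviour described above.

```python
class ToyConnection:
    """Hypothetical model of http.client.HTTPConnection."""
    debuglevel = 0  # class variable, intended as the global default

    def set_debuglevel(self, level):
        # Assigning on self creates an instance attribute that shadows the class one.
        self.debuglevel = level


class ToyHandler:
    """Hypothetical model of HTTPHandler as it behaves since Python 3.5.2."""

    def __init__(self, debuglevel=0):
        self._debuglevel = debuglevel

    def open(self):
        conn = ToyConnection()
        # Always applied, even when the caller never passed a debuglevel.
        conn.set_debuglevel(self._debuglevel)
        return conn


# Setting the class variable has no effect on connections made by the handler.
ToyConnection.debuglevel = 1

if __name__ == "__main__":
    print(ToyConnection().debuglevel)                   # class variable is visible
    print(ToyHandler().open().debuglevel)               # shadowed by the default
    print(ToyHandler(debuglevel=1).open().debuglevel)   # explicit argument wins
```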

wheeler