
I have a very simple problem, and I am amazed that I haven't seen anything that addresses it specifically. I am trying to follow best practices for copying a file hosted on a web server, going through a proxy server (which does not require auth), using Python 3.

I have done similar things using Python 2.5, but I am really coming up short here. I am trying to make this into a function that I can reuse in future scripts on this network. Any assistance would be greatly appreciated.

I have the feeling that my issue lies in trying to use urllib.request or http.client without any clear documentation on how to use a proxy (without auth).

I've been looking here and pulling out my hair:

http://docs.python.org/3.1/library/urllib.request.html#urllib.request.ProxyHandler
http://docs.python.org/3.1/library/http.client.html
http://diveintopython3.org/http-web-services.html

I even looked at this Stack Overflow question: Proxy with urllib2

but in python3 urllib2 is deprecated...

MadSc13ntist
  • urllib2 is not deprecated. It was merged together with urlparse and robotparser into urllib. All functionality provided by urllib2 is now in urllib. – nosklo Nov 23 '09 at 17:07
  • I'll upvote you if you fix the error in your post; it's really a good question except for the bit about deprecation. – Sheena Feb 03 '13 at 08:19

1 Answer


Here is a function to retrieve a file through an HTTP proxy:

import shutil
import urllib.request

def retrieve(url, filename):
    # Send plain-HTTP requests through the proxy; replace the address with your own.
    proxy = urllib.request.ProxyHandler({'http': '127.0.0.1'})
    opener = urllib.request.build_opener(proxy)
    # The with block closes both the response and the file, even on error.
    with opener.open(url) as remote, open(filename, 'wb') as local:
        # Copy in chunks so large files are never held in memory at once.
        shutil.copyfileobj(remote, local)

(error handling is left as an exercise to the reader...)
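One way to approach that exercise is sketched below. It is only an illustration, not part of the original answer: the function name `retrieve_safely`, the `timeout` default, and the `.part` temporary-file suffix are all assumptions, and the function expects an opener that was already built with `urllib.request.build_opener` as in the code above.

```python
import os
import shutil
import urllib.error
import urllib.request


def retrieve_safely(opener, url, filename, timeout=30):
    """Download url to filename; return True on success, False on failure."""
    # Hypothetical convention: write to a temporary name first so that a
    # failed transfer never leaves a truncated file under the final name.
    partial = filename + '.part'
    try:
        with opener.open(url, timeout=timeout) as remote, \
                open(partial, 'wb') as local:
            shutil.copyfileobj(remote, local)
    except (urllib.error.URLError, OSError) as exc:
        # HTTPError is a subclass of URLError, which is itself an OSError,
        # so proxy, network, HTTP-status, and disk errors all land here.
        print('download failed: %s' % exc)
        if os.path.exists(partial):
            os.remove(partial)
        return False
    # Move the finished download into place.
    os.replace(partial, filename)
    return True
```

Returning a boolean keeps the caller simple; a script that needs to distinguish a refused proxy connection from an HTTP error status could re-raise or inspect the exception instead.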

You can also keep the opener object around for later use if you need to retrieve several files. The content is written to the file as-is, so it may need to be decoded afterwards if the server used a special encoding.
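A minimal sketch of keeping the opener around might look like the following. The helper names `make_opener` and `retrieve_with` are made up for illustration, and registering the same proxy for both `http` and `https` is an assumption about the network rather than something stated in the question.

```python
import shutil
import urllib.request


def make_opener(proxy_url):
    """Build an opener that sends HTTP and HTTPS requests through proxy_url."""
    # Assumption: one proxy serves both schemes; drop 'https' if it does not.
    handler = urllib.request.ProxyHandler({'http': proxy_url,
                                           'https': proxy_url})
    return urllib.request.build_opener(handler)


def retrieve_with(opener, url, filename):
    """Copy the resource at url into filename using an existing opener."""
    with opener.open(url) as remote, open(filename, 'wb') as local:
        shutil.copyfileobj(remote, local)
```

The opener is then created once, for example `opener = make_opener('http://127.0.0.1:3128')` (a placeholder address), and passed to `retrieve_with` for each file. Calling `urllib.request.install_opener(opener)` instead would make plain `urllib.request.urlopen` use the proxy for the rest of the process.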

Adrien Plisson