1

Basically, I need a program that, given a URL, downloads a file and saves it. I know this should be easy, but there are a couple of complications.

First, it is part of a tool I'm building at work. I have everything else done, and the URL is HTTPS. It is the kind of URL you would paste into your browser and get a pop-up asking whether you want to open or save the file (.txt).

Second, I'm a beginner at this, so if there's info I'm not providing please ask me. :)

I'm using Python 3.3 by the way.

I tried this:

import urllib.request
response = urllib.request.urlopen('https://websitewithfile.com')
txt = response.read()
print(txt)

And I get:

urllib.error.HTTPError: HTTP Error 401: Authorization Required

Any ideas? Thanks!!

jfs
user3063129
  • I have the same problem: in the browser the content loads and you can see it, but in Python, even when we send the authorization header first, we get a 401 error and no content. – Mostafa Aug 17 '18 at 18:40

4 Answers

6

You can do this easily with the requests library.

import requests

# verify=False skips TLS certificate checks; use verify=True whenever the certificate is valid.
response = requests.get('https://websitewithfile.com/text.txt', verify=False, auth=('user', 'pass'))
print(response.text)

To save the file, you would write:

with open('filename.txt', 'w') as fout:
    fout.write(response.text)

(I would suggest you always set verify=True in the requests.get() call.)
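Putting the pieces above together, a minimal sketch of a download helper might look like the following. The function name `save_url`, its parameters, and the 30-second timeout are assumptions for illustration, not part of the original answer. It keeps certificate verification on by default, raises on HTTP errors such as 401 instead of saving an error page, and writes the raw bytes so the file is not re-encoded.

```python
import requests


def save_url(url, dest_path, user, password, session=None, verify=True):
    """Download `url` with HTTP Basic auth and save the raw bytes to `dest_path`.

    `session` is optional and hypothetical here: any object with a
    requests-style .get() method works, which makes the helper easy to test.
    """
    http = session or requests.Session()
    response = http.get(url, auth=(user, password), verify=verify, timeout=30)
    # Raise an exception on 4xx/5xx responses rather than writing them to disk.
    response.raise_for_status()
    # Binary mode plus response.content avoids any text decoding surprises.
    with open(dest_path, 'wb') as fout:
        fout.write(response.content)
    return dest_path
```

A call would then be something like `save_url('https://websitewithfile.com/text.txt', 'filename.txt', 'user', 'pass')`. If the server uses a certificate from an internal authority, `verify` can also be given the path to a CA bundle file rather than being set to False.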

Here is the documentation:

Back2Basics
  • OK, this gave me: 'requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:550)'. Any thoughts on how to deal with this? – user3063129 Dec 04 '13 at 00:04
  • Wow! I added "verify=False" on line 2 and got "response [200]" as output. My question is now: how do I get the file after that? – user3063129 Dec 04 '13 at 00:26
  • What if it is a POST? It doesn't have the same params. – RollRoll May 18 '18 at 13:53
2

Doesn't the browser also ask you to sign in? Then you need to repeat the request with the added authentication like this:
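With only the standard library, one way to do that might be sketched as follows, assuming the server uses HTTP Basic authentication (other schemes such as Digest or a login form need a different handler). The function names are hypothetical. The opener sends the request, and when the server answers with a 401 challenge it repeats the request with the stored credentials.

```python
import shutil
import urllib.request


def make_password_manager(url, user, password):
    """Return a password manager holding credentials for `url`."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means: use these credentials for any realm under this URL.
    manager.add_password(None, url, user, password)
    return manager


def build_auth_opener(url, user, password):
    """Return an opener that answers a 401 Basic challenge with credentials."""
    handler = urllib.request.HTTPBasicAuthHandler(
        make_password_manager(url, user, password))
    return urllib.request.build_opener(handler)


def download_with_auth(url, dest_path, user, password):
    """Fetch `url` with Basic auth and write the raw bytes to `dest_path`."""
    opener = build_auth_opener(url, user, password)
    with opener.open(url) as response, open(dest_path, 'wb') as fout:
        shutil.copyfileobj(response, fout)
    return dest_path
```

For example, `download_with_auth('https://websitewithfile.com/text.txt', 'filename.txt', 'user', 'pass')` would save the file locally if the credentials are accepted.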

Community
ajm475du
2

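If you don't have the Requests module, the code below works for Python 2.6 and later 2.x releases. It will not work unchanged on 3.x, where urllib was reorganized into submodules; the closest equivalent there is urllib.request.urlretrieve().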

import urllib

testfile = urllib.URLopener()
testfile.retrieve("https://randomsite.com/file.gz", "/local/path/to/download/file")
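For Python 3, which the question uses, a rough counterpart to the snippet above could be sketched like this. The helper name `download_file` is an assumption for illustration. It streams the response straight into a file opened in binary mode, so it works for .gz and .txt files alike. It does not handle authentication by itself.

```python
import shutil
import urllib.request


def download_file(url, dest_path):
    """Stream the body at `url` into `dest_path` without decoding it."""
    with urllib.request.urlopen(url) as response, open(dest_path, 'wb') as fout:
        # copyfileobj reads and writes in chunks, so large files
        # are never held in memory all at once.
        shutil.copyfileobj(response, fout)
    return dest_path
```

Usage would mirror the Python 2 version: `download_file("https://randomsite.com/file.gz", "/local/path/to/download/file")`.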
0

You can try this solution: https://github.qualcomm.com/graphics-infra/urllib-siteminder

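import getpass
import siteminder  # from the urllib-siteminder repository linked above

url = 'https://XYZ.dns.com'
# getpass.getpass() prompts "Password:" and waits for you to enter your password.
r = siteminder.urlopen(url, getpass.getuser(), getpass.getpass(), "dns.com")

data = r.read()
# Alternatively, to parse HTML tables (requires `import pandas as pd`):
# data = pd.read_html(r.read())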
imankalyan