Using Python, I want to get the raw HTML of a webpage that requires authentication.

This is similar to this question, but the answers there do not work for me.

Code I am trying:

import urllib, urllib2, cookielib

username = 'redacted'
password = 'redacted'

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'j_password' : password})
opener.open('https://redacted.net', login_data)  # e.g. http://www.example.com/login.php
resp = opener.open('https://redacted.net')  # e.g. http://www.example.com/hiddenpage.php
print resp.read()  # print the raw HTML; the opener can fetch any page using the session cookie

Error:

Traceback (most recent call last):
  File "C:/Users/Jacob/Desktop/School/Python_Scripts/session refresher/session_refresher.py", line 9, in <module>
    opener.open('Redacted', login_data)#http://www.example.com/login.php
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 401: Unauthorized

Here is the window that pops up to ask for authentication when I open the webpage in a browser:

(Image: popup that asks for authentication)

GJH105775

2 Answers

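I'd use requests for this, as it is simpler than what urllib provides for authentication. The browser popup suggests HTTP authentication (typically Basic) rather than an HTML login form, which would explain why posting form data returns a 401.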

import requests
r = requests.get("https://redacted.net", auth=('username', 'password'))
print(r.text)
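
For comparison, here is a rough sketch of the standard-library route, in case installing requests is not an option. It assumes Python 3 (where urllib2 became urllib.request) and assumes the site uses HTTP Basic authentication, which the popup suggests but does not prove. The URL and credentials are placeholders, and build_basic_auth_opener is a hypothetical helper name, not something from the question.

```python
import urllib.request


def build_basic_auth_opener(url, username, password):
    """Return (opener, password manager) set up for HTTP Basic auth at `url`."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None means: use these credentials for any realm at this URL.
    manager.add_password(None, url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler), manager


# Placeholder URL and credentials (assumptions, not taken from the question).
opener, manager = build_basic_auth_opener(
    "https://example.com/", "username", "password"
)

# Uncomment to actually fetch the page; this performs a network request.
# html = opener.open("https://example.com/").read()
```

The opener answers the server's 401 challenge with the stored credentials, which is the step the cookie-based opener in the question never performs.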
ZetaRift

Use requests and supply your username/password pair in the request:

import requests

r = requests.get('https://redacted.net', auth=('user', 'pass'))
print(r.text)  # raw HTML of the page
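
To illustrate how the user/pass pair travels with the request, here is a minimal sketch of the Authorization header that HTTP Basic authentication uses. It assumes the server expects Basic auth (Digest or NTLM would look different) and UTF-8 encoding of the credentials; basic_auth_header is a hypothetical helper, and this is roughly what passing a tuple as auth does rather than the library's exact code.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic auth."""
    pair = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(pair).decode("ascii")
    return "Basic " + token


# Placeholder credentials (assumptions for illustration only).
header_value = basic_auth_header("user", "pass")
print(header_value)  # Basic dXNlcjpwYXNz
```

The credentials are only base64-encoded, not encrypted, so this kind of login should be sent over https.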
midori