How can I log in to a website with Python?

Question

How can I do this? I am trying to open a specific link with urllib, but I need to log in first.

I have this source from the site:

<form id="login-form" action="auth/login" method="post">
    <div>
    <!--label for="rememberme">Remember me</label><input type="checkbox" class="remember" checked="checked" name="remember me" /-->
    <label for="email" id="email-label" class="no-js">Email</label>
    <input id="email-email" type="text" name="handle" value="" autocomplete="off" />
    <label for="combination" id="combo-label" class="no-js">Combination</label>
    <input id="password-clear" type="text" value="Combination" autocomplete="off" />
    <input id="password-password" type="password" name="password" value="" autocomplete="off" />
    <input id="sumbitLogin" class="signin" type="submit" value="Sign In" />

Is this possible?

score 80 · Accepted Answer · edited Jul 21 '20 at 06:53

Maybe you want to use twill. It's quite easy to use and should be able to do what you want.

It will look like the following:

from twill.commands import *
go('http://example.org')

fv("1", "email-email", "blabla.com")
fv("1", "password-clear", "testpass")

submit('0')

You can use showforms() to list all forms once you have used go… to browse to the site you want to log in to. Just try it from the Python interpreter.

edited Jul 21 '20 at 06:53

Aaron Chamberlain

answered May 26 '10 at 05:38

sloth

note that in some cases you need to use submit(). see: http://lists.idyll.org/pipermail/twill/2006-August/000526.html I confirm this issue, for me, logging into www.pge.com, using submit() works. – user391339 Sep 11 '14 at 07:47
Is there a solution for Python 3.6? It seems like twill doesn't support Python 3.5 nor 3.6. I tried downloading it and converting it using `2to3` but now I get a `ModuleNotFoundError` when trying to import it. – stefanbschneider Aug 02 '17 at 11:04
Actually, I could resolve the `ModuleNotFoundError` by using/converting Twill 1.8.0 and installing `lxml` and `requests` with `pip install`. But now I get a `SyntaxError` when I try to import because somewhere `False = 0`.... – stefanbschneider Aug 02 '17 at 11:18
It's kind of a pain to fix it, but it works: https://stackoverflow.com/a/45459994/2745116 – stefanbschneider Aug 02 '17 at 12:55
Does it work with HTTPS sites, or do I have to do something like [this](https://stackoverflow.com/a/39359295/1317018)? – Mahesha999 Jul 12 '18 at 14:20
webbot is much easier to use : https://stackoverflow.com/a/51170181/6665568 – Natesh bhat Feb 04 '20 at 03:41

Tarun Venugopal Nair · Answer 2 · 2016-03-09T08:31:03.507

66

Let me try to make it simple. Suppose the URL of the site is www.example.com and you need to log in by filling in a username and password. Go to the login page, say http://www.example.com/login.php, view its source code and search for the action URL. It will be in a form tag, something like

 <form name="loginform" method="post" action="userinfo.php">

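Now resolve userinfo.php against the login page to get the absolute URL, which will be 'http://www.example.com/userinfo.php', and run a simple Python script:

import requests

# Absolute URL built from the form's action attribute
url = 'http://www.example.com/userinfo.php'
# Keys must match the name attributes of the form's input fields
values = {'username': 'user',
          'password': 'pass'}

r = requests.post(url, data=values)
print(r.content)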
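
As a hedged sketch of the "view the source and search for the action URL" step, the standard library can pull the action and the named inputs out of a login page and resolve the absolute URL. The `FormInspector` class, the sample HTML and the input names `username` and `password` are illustrative assumptions, not taken from any real site; the login page URL is the one assumed in this answer.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormInspector(HTMLParser):
    """Record the first form's action/method and every named input on the page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = {}  # name attribute -> default value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and attrs.get("name"):
            # Inputs without a name attribute are never sent to the server.
            self.fields[attrs["name"]] = attrs.get("value") or ""


# Hypothetical login page; in practice this would be the downloaded HTML.
page_url = "http://www.example.com/login.php"
html = """
<form name="loginform" method="post" action="userinfo.php">
  <input type="text" name="username" value="" />
  <input type="password" name="password" value="" />
  <input type="submit" value="Log in" />
</form>
"""

inspector = FormInspector()
inspector.feed(html)

# Resolve the (possibly relative) action against the page it came from.
post_url = urljoin(page_url, inspector.action)
print(post_url)
print(sorted(inspector.fields))
```

The keys reported in `fields` are the ones to use in the `values` dictionary that gets posted.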

I hope this helps someone, somewhere, someday.

edited Mar 09 '16 at 08:31

answered Feb 20 '15 at 12:01

Tarun Venugopal Nair

this does not work for most of the websites that i tried – Anurag Pandey Aug 26 '16 at 03:39
Out of the two dozen help/stackoverflow pages I looked at this was the only solution that worked on the one site I needed. – Buoy Apr 19 '17 at 02:10
best choice for web automation is webbot. https://stackoverflow.com/a/51170181/6665568 – Natesh bhat Apr 08 '19 at 07:27
Are all values always username & password? I don't think this seems to be working for my chosen site. – Dylan Logan May 17 '19 at 10:42
@DylanLogan You always have to inspect what the actual webpage sends to the server and adapt your script to it. The server should not be able to distinguish between your script and the web browser. – Jeyekomon Jul 12 '19 at 11:17

score 31 · Answer 3 · answered May 26 '10 at 06:19

Typically you'll need cookies to log into a site, which in Python 2 means cookielib, urllib and urllib2 (in Python 3 these became http.cookiejar, urllib.parse and urllib.request). Here's a class which I wrote back when I was playing Facebook web games:

import cookielib
import urllib
import urllib2

# set these to whatever your fb account is
fb_username = "your@facebook.login"
fb_password = "secretpassword"

class WebGamePlayer(object):

    def __init__(self, login, password):
        """ Start up... """
        self.login = login
        self.password = password

        self.cj = cookielib.CookieJar()
        self.opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
            urllib2.HTTPSHandler(debuglevel=0),
            urllib2.HTTPCookieProcessor(self.cj)
        )
        self.opener.addheaders = [
            ('User-agent', ('Mozilla/4.0 (compatible; MSIE 6.0; '
                           'Windows NT 5.2; .NET CLR 1.1.4322)'))
        ]

        # need this twice - once to set cookies, once to log in...
        self.loginToFacebook()
        self.loginToFacebook()

    def loginToFacebook(self):
        """
        Handle login. This should populate our cookie jar.
        """
        login_data = urllib.urlencode({
            'email' : self.login,
            'pass' : self.password,
        })
        response = self.opener.open("https://login.facebook.com/login.php", login_data)
        return ''.join(response.readlines())

You won't necessarily need the HTTPS or Redirect handlers, but they don't hurt, and it makes the opener much more robust. You also might not need cookies, but it's hard to tell just from the form that you've posted. I suspect that you might, purely from the 'Remember me' input that's been commented out.
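
For readers on Python 3, here is a minimal sketch of the same cookie-handling idea, assuming only the standard library. The helper names `build_cookie_opener` and `log_in`, the User-agent string and the URL in the commented-out call are hypothetical; `build_opener` already installs the default HTTP, HTTPS and redirect handlers, so only the cookie processor is passed explicitly.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_cookie_opener():
    """Return (opener, cookie_jar); the opener stores and resends cookies."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # Hypothetical User-agent; some sites reject the default Python one.
    opener.addheaders = [("User-agent", "Mozilla/5.0 (compatible; example-script)")]
    return opener, cookie_jar


def log_in(opener, login_url, fields):
    """POST the form fields; cookies set by the server end up in the jar."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with opener.open(login_url, data, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


opener, cookie_jar = build_cookie_opener()

# Not executed here because it needs a real site and real credentials:
# page = log_in(opener, "https://example.com/login.php",
#               {"email": "you@example.com", "pass": "secret"})
```

Any later `opener.open(...)` call made with the same opener sends the stored cookies back, which is what keeps the session logged in.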

score 23 · Answer 4 · edited Jun 20 '20 at 09:12

Web page automation? Definitely "webbot".

webbot even works on web pages which have dynamically changing ids and class names, and it has more methods and features than selenium or mechanize.

Here's a snippet :)

from webbot import Browser
web = Browser()
web.go_to('google.com')
web.click('Sign in')
web.type('mymail@gmail.com', into='Email')
web.click('NEXT', tag='span')
web.type('mypassword', into='Password', id='passwordFieldId')  # specific selection
web.click('NEXT', tag='span')  # you are logged in ^_^

The docs are also pretty straightforward and simple to use: https://webbot.readthedocs.io

edited Jun 20 '20 at 09:12

Community

answered Jul 04 '18 at 09:22

Natesh bhat

This example works great. Will this also work where `autocomplete=off`? – S Andrew Aug 27 '18 at 08:58
It does not install on Windows 64-bit. Error: `Could not find a version that satisfies the requirement webbot (from versions: 0.0.1.win-amd64)` – Mostafa Dec 27 '18 at 10:22
Try using python3 – Natesh bhat Dec 28 '18 at 03:48
How do I handle an iframe in webbot? I mean, I have to close an iframe which pops up after the page is loaded. – arihanth jain Apr 01 '20 at 10:09

score 19 · Answer 5 · edited Oct 26 '16 at 07:11

import cookielib
import urllib
import urllib2

url = 'http://www.someserver.com/auth/login'
# Form fields are submitted by their name attribute, not their id:
# the form in the question names them 'handle' and 'password'.
values = {'handle' : 'john@example.com',
          'password' : 'mypassword' }

data = urllib.urlencode(values)
cookies = cookielib.CookieJar()

opener = urllib2.build_opener(
    urllib2.HTTPRedirectHandler(),
    urllib2.HTTPHandler(debuglevel=0),
    urllib2.HTTPSHandler(debuglevel=0),
    urllib2.HTTPCookieProcessor(cookies))

response = opener.open(url, data)
the_page = response.read()
http_headers = response.info()
# The login cookies should be contained in the cookies variable

For more information visit: https://docs.python.org/2/library/urllib2.html

score 9 · Answer 6 · answered May 26 '10 at 05:27

Websites in general can check authorization in many different ways, but the one you're targeting seems to make it reasonably easy for you.

All you need to do is POST a form-encoded blob to the auth/login URL with the named fields you see there (forget the labels; they're decoration for human visitors): handle=whatever&password=pwd and so on. Only inputs with a name attribute are submitted, so password-clear, which has none, is left out. As long as you know the values for the handle (AKA email) and password, you should be fine.

Presumably that POST will redirect you to some "you've successfully logged in" page with a Set-Cookie header validating your session (be sure to save that cookie and send it back on further interaction along the session!).
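
To make the "form-encoded blob" concrete, here is a small sketch using only the Python 3 standard library. It builds the request without sending it; the host name and the credentials are placeholders, and the field names `handle` and `password` are the name attributes from the form in the question.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request

# Hypothetical address of the page that contains the login form.
page_url = "http://www.someserver.com/"

# The form's action is relative, so resolve it against the page URL.
login_url = urljoin(page_url, "auth/login")

# Placeholder credentials, keyed by the inputs' name attributes.
fields = {"handle": "john@example.com", "password": "mypassword"}
body = urlencode(fields).encode("ascii")

# Supplying data makes urllib issue a POST instead of a GET.
request = Request(login_url, data=body)

print(request.get_method(), request.full_url)
print(body.decode("ascii"))
```

Sending `request` through an opener that has a cookie processor attached would then capture the session cookie from the Set-Cookie header.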

score 5 · Answer 7 · edited May 25 '20 at 11:55

For HTTP things, the current choice should be Requests: HTTP for Humans.
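
A minimal sketch of a login with Requests, assuming the form from the question: a `requests.Session` keeps the cookies from the login response and sends them on later requests. The two URLs and the function name `fetch_after_login` are hypothetical, and a real site may require extra fields (for example a hidden token), so inspect what the browser actually sends first.

```python
import requests

# Hypothetical URLs; replace them with the real login and target pages.
LOGIN_URL = "http://www.someserver.com/auth/login"
PROTECTED_URL = "http://www.someserver.com/members"


def fetch_after_login(handle, password):
    """Log in, then fetch a page that requires the session cookie."""
    with requests.Session() as session:
        reply = session.post(
            LOGIN_URL,
            data={"handle": handle, "password": password},
            timeout=30,
        )
        reply.raise_for_status()  # fail early on 4xx/5xx responses

        # The session resends the cookies it received during login.
        page = session.get(PROTECTED_URL, timeout=30)
        page.raise_for_status()
        return page.text


# Example call (needs a real site and real credentials):
# html = fetch_after_login("john@example.com", "mypassword")
```

Note that a 200 status does not prove the login succeeded; many sites return the login page again with an error message, so checking the returned text is usually necessary.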

edited May 25 '20 at 11:55

jtessier72

answered Dec 15 '13 at 02:53

Andrew_1510
