68

How can I use the requests library (in Python), after a request such as

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
bot = requests.session()
bot.get('http://google.com')

to save all the cookies to a file and then restore them from that file?

Martijn Pieters
agrynchuk

12 Answers

94

There is no built-in way to do so, but it's not hard to do.

You can get a CookieJar object from the session with session.cookies, and use pickle to store it to a file.

A full example:

import requests, pickle
session = requests.session()
# Make some calls
with open('somefile', 'wb') as f:
    pickle.dump(session.cookies, f)

Loading is then:

session = requests.session()  # or an existing session

with open('somefile', 'rb') as f:
    session.cookies.update(pickle.load(f))

The requests library uses the requests.cookies.RequestsCookieJar() subclass, which explicitly supports pickling and a dict-like API. The RequestsCookieJar.update() method can be used to update an existing session cookie jar with the cookies loaded from the pickle file.
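
As a minimal sketch of the round trip described above: the helper names `save_cookies` and `load_cookies`, the cookie values, and the temporary file path are illustrative assumptions, not part of the requests API. The cookie is set by hand rather than by a network request, so the example can run offline and show that the domain and path survive pickling.

```python
import os
import pickle
import tempfile

import requests


def save_cookies(session, path):
    # Pickle data is binary, hence 'wb'
    with open(path, "wb") as f:
        pickle.dump(session.cookies, f)


def load_cookies(session, path):
    with open(path, "rb") as f:
        # update() copies every cookie from the unpickled jar into the session
        session.cookies.update(pickle.load(f))


# Stand-in for "make some calls": set a cookie manually (made-up values)
source = requests.Session()
source.cookies.set("token", "abc123", domain="example.com", path="/")

path = os.path.join(tempfile.mkdtemp(), "cookies.pkl")
save_cookies(source, path)

restored = requests.Session()
load_cookies(restored, path)

cookie = next(iter(restored.cookies))
print(cookie.name, cookie.value, cookie.domain, cookie.path)
```

Only unpickle files you wrote yourself: `pickle.load` can execute arbitrary code from a crafted file.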

Rob Bednark
madjar
  • 12
    `requests.utils.dict_from_cookiejar` and `requests.utils.cookiejar_from_dict` are not required. They don't save cookies with the same name for different domains and don't save all the required cookies data. I spent a lot of time debugging just because of these. – Elmo Sep 08 '14 at 09:45
  • 1
    Pickling a cookiejar doesn't seem to save the host that it's associated with. When you load it again, all cookies are just in the host ''. – MattCochrane Oct 20 '15 at 07:18
  • 9
    @MattClimbs, it is `dict_from_cookiejar` which doesn't save host information. Actually, in current version `session.cookies` can be pickled and unpickled directly, without converting to `dict`. Also, `requests.utils.dict_from_cookiejar` can be replaced with `session.cookies.get_dict()`, and `cookiejar_from_dict` can be replaced with `session.cookies.update(my_dict)`. – MarSoft Dec 13 '16 at 17:21
  • 7
    *Editor's note*: I've updated this top answer rather than add a new post. It was close enough but needed updating for API changes made in the intervening 6 years that make this task all that much easier. Do not use the dictionary utilities, they are not needed at all and do not preserve important cookie metadata. I'm fine with posting my own answer if the author wishes to revert my changes; I'd appreciate a heads-up in that case. – Martijn Pieters Oct 05 '18 at 10:38
  • In my case (a Python newbie), after `open() as f`, there should be a `f.close()`. Otherwise two consecutive `open()` calls would raise "TypeError: can't pickle _thread.RLock objects". – Weekend Oct 09 '20 at 02:15
  • 1
    @Weekend, you don't need to explicitly call `f.close()` when using the `with` statement. The _context manager_ implicitly calls `f.close()` for you. See https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files The 2 separate `with` statements in the answer are not nested. – GordonAitchJay Feb 23 '21 at 06:18
36

After a call such as r = requests.get(), r.cookies is a RequestsCookieJar, which you can pickle directly:

import pickle

import requests

def save_cookies(requests_cookiejar, filename):
    with open(filename, 'wb') as f:
        pickle.dump(requests_cookiejar, f)

def load_cookies(filename):
    with open(filename, 'rb') as f:
        return pickle.load(f)

# Save cookies (url and filename are placeholders for your own values)
r = requests.get(url)
save_cookies(r.cookies, filename)

# Load cookies and send them with a request
requests.get(url, cookies=load_cookies(filename))

If you want to save your cookies in a human-readable format, you have to do some work to copy the cookies from the RequestsCookieJar into an LWPCookieJar.

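try:
    import cookielib  # Python 2
except ImportError:
    import http.cookiejar as cookielib  # Python 3 name for the same module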
def save_cookies_lwp(cookiejar, filename):
    lwp_cookiejar = cookielib.LWPCookieJar()
    for c in cookiejar:
        args = dict(vars(c).items())
        args['rest'] = args['_rest']
        del args['_rest']
        c = cookielib.Cookie(**args)
        lwp_cookiejar.set_cookie(c)
    lwp_cookiejar.save(filename, ignore_discard=True)

def load_cookies_from_lwp(filename):
    lwp_cookiejar = cookielib.LWPCookieJar()
    lwp_cookiejar.load(filename, ignore_discard=True)
    return lwp_cookiejar

#save human-readable
r = requests.get(url)
save_cookies_lwp(r.cookies, filename)

#you can pass a LWPCookieJar directly to requests
requests.get(url, cookies=load_cookies_from_lwp(filename))
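
On Python 3, a possible variant of the same idea uses only `http.cookiejar` and copies each cookie with `copy.copy` instead of rebuilding it from `vars()`. This is a sketch under assumptions: `make_cookie`, the cookie values, and the temporary path are made up for the demonstration, and a plain `CookieJar` stands in for `r.cookies` (a `RequestsCookieJar` is a `CookieJar` subclass, so it should be accepted the same way).

```python
import copy
import os
import tempfile
from http.cookiejar import Cookie, CookieJar, LWPCookieJar


def save_cookies_lwp(cookiejar, filename):
    lwp_jar = LWPCookieJar()
    for cookie in cookiejar:
        # Copy so the source jar's Cookie objects are not shared
        lwp_jar.set_cookie(copy.copy(cookie))
    # ignore_discard=True also writes session cookies
    lwp_jar.save(filename, ignore_discard=True)


def load_cookies_from_lwp(filename):
    lwp_jar = LWPCookieJar()
    lwp_jar.load(filename, ignore_discard=True)
    return lwp_jar


def make_cookie(name, value, domain):
    # Hypothetical helper: builds a session cookie with made-up attributes
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


source = CookieJar()  # stand-in for r.cookies
source.set_cookie(make_cookie("k1", "v1", "example.com"))

path = os.path.join(tempfile.mkdtemp(), "cookies.lwp")
save_cookies_lwp(source, path)

restored = load_cookies_from_lwp(path)
print([(c.name, c.value, c.domain) for c in restored])
```
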
dtheodor
29

Here is a way to do it with JSON.

To save the cookies:

import json
with open('cookie.txt', 'w') as f:
    json.dump(requests.utils.dict_from_cookiejar(bot.cookies), f)

And to load the cookies:

import json
session = requests.session()  # or an existing session

with open('cookie.txt', 'r') as f:
    cookies = requests.utils.cookiejar_from_dict(json.load(f))
    session.cookies.update(cookies)
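
One caveat with a name-to-value dict, raised in the comments on the top answer, is that cookies with the same name on different domains collapse into one entry and the domain is lost on the way back. The sketch below illustrates this; the cookie names and domains are made up for the demonstration.

```python
from requests.cookies import RequestsCookieJar
from requests.utils import cookiejar_from_dict, dict_from_cookiejar

jar = RequestsCookieJar()
# Two cookies with the same name but different domains (illustrative values)
jar.set("sid", "one", domain="a.example.com", path="/")
jar.set("sid", "two", domain="b.example.com", path="/")

as_dict = dict_from_cookiejar(jar)      # keyed by cookie name only
rebuilt = cookiejar_from_dict(as_dict)  # the domain is not restored

print(len(jar), len(as_dict))
print(repr(next(iter(rebuilt)).domain))
```

If the cookies all belong to one site and have unique names, the JSON approach may still be good enough.
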
Emil
Lewis Livermore
24

Expanding on @miracle2k's answer, requests Sessions are documented to work with any cookielib CookieJar. The LWPCookieJar (and MozillaCookieJar) can save and load their cookies to and from a file. Here is a complete code snippet which will save and load cookies for a requests session. The ignore_discard parameter is used to work with httpbin for the test, but you may not want to include it in your real code.

import os
from http.cookiejar import LWPCookieJar  # Python 2: from cookielib import LWPCookieJar

import requests


s = requests.Session()
s.cookies = LWPCookieJar('cookiejar')
if not os.path.exists('cookiejar'):
    # Create a new cookies file and set our Session's cookies
    print('setting cookies')
    s.cookies.save()
    r = s.get('http://httpbin.org/cookies/set?k1=v1&k2=v2')
else:
    # Load saved cookies from the file and use them in a request
    print('loading saved cookies')
    s.cookies.load(ignore_discard=True)
    r = s.get('http://httpbin.org/cookies')
print(r.text)
# Save the session's cookies back to the file
s.cookies.save(ignore_discard=True)
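
To see what ignore_discard changes, here is a small offline sketch using only the standard library. The helpers `make_cookie` and `names_in_file`, the cookie names, and the far-future expiry timestamp are assumptions made for the demonstration; no request is sent.

```python
import os
import tempfile
from http.cookiejar import Cookie, LWPCookieJar


def make_cookie(name, value, expires=None):
    # Hypothetical helper: a cookie without an expiry is a session cookie
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=(expires is None),
        comment=None, comment_url=None, rest={},
    )


def names_in_file(path):
    # Read everything that was written, including session cookies
    loaded = LWPCookieJar(path)
    loaded.load(ignore_discard=True)
    return sorted(c.name for c in loaded)


path = os.path.join(tempfile.mkdtemp(), "cookiejar")
jar = LWPCookieJar(path)
jar.set_cookie(make_cookie("session_only", "s"))
jar.set_cookie(make_cookie("persistent", "p", expires=4102444800))  # year 2100

jar.save()  # default: cookies marked discard are skipped
saved_by_default = names_in_file(path)

jar.save(ignore_discard=True)  # session cookies are written too
saved_with_flag = names_in_file(path)

print(saved_by_default)
print(saved_with_flag)
```
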
gazpachoking
10

I found that the other answers had problems:

  • They didn't apply to sessions.
  • They didn't save and load properly. Only the cookie name and value were saved; the expiry date, domain name, etc. were all lost.

This answer fixes these two issues:

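import os
import pickle

import requests.cookies


def save_cookies(session, filename):
    # A bare filename has an empty dirname, so fall back to the current directory
    if not os.path.isdir(os.path.dirname(filename) or '.'):
        return False
    # Pickle data is binary, so open the file in binary mode
    with open(filename, 'wb') as f:
        pickle.dump(session.cookies._cookies, f)
    return True


def load_cookies(session, filename):
    if not os.path.isfile(filename):
        return False

    with open(filename, 'rb') as f:
        cookies = pickle.load(f)
    if not cookies:
        return False
    # Rebuild a jar around the saved domain/path/name mapping
    jar = requests.cookies.RequestsCookieJar()
    jar._cookies = cookies
    session.cookies = jar
    return True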

Then just call save_cookies(session, filename) to save or load_cookies(session, filename) to load. Simple as that.

MattCochrane
  • 1
    This worked for me without the os.path.isdir lines in each function. Calling open() creates the file if it doesn't exist, and writes correctly for my use-case. – Rontron Sep 30 '18 at 21:47
  • This works well, but needs a small update for Python 3. Pickling creates binary data, so the open calls should be: open(filename, "wb") and open(filename, "rb") respectively. – djkrause Oct 07 '22 at 19:48
9

This will do the job:

session.cookies = LWPCookieJar('cookies.txt')

The CookieJar API requires you to call load() and save() manually though. If you do not care about the cookies.txt format, I have a ShelvedCookieJar implementation that will persist on change.
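
One way to avoid forgetting the manual load() and save() calls is to wrap them in a context manager. The sketch below is only an illustration: `persistent_cookiejar` and `make_cookie` are hypothetical helpers, the cookie is set by hand instead of by a request, and the file lives in a temporary directory.

```python
import os
import tempfile
from contextlib import contextmanager
from http.cookiejar import Cookie, LWPCookieJar


@contextmanager
def persistent_cookiejar(filename, ignore_discard=True):
    """Hypothetical helper: load the jar on entry, save it on exit."""
    jar = LWPCookieJar(filename)
    if os.path.exists(filename):
        jar.load(ignore_discard=ignore_discard)
    try:
        yield jar
    finally:
        jar.save(ignore_discard=ignore_discard)


def make_cookie(name, value):
    # Hypothetical helper: a session cookie with made-up attributes
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


path = os.path.join(tempfile.mkdtemp(), "cookies.txt")

# First run: the file does not exist yet, so the jar starts empty
with persistent_cookiejar(path) as jar:
    # With requests you would assign the jar here: session.cookies = jar
    jar.set_cookie(make_cookie("k1", "v1"))

# Second run: the cookie written above is loaded back from disk
with persistent_cookiejar(path) as jar:
    names = [c.name for c in jar]

print(names)
```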

miracle2k
  • 3
    This answer is missing a few steps. Here is the full code: `cj = cookielib.LWPCookieJar(cookie_file)` `cj.load()` `session.cookies = cj` – ChaimG Jul 29 '15 at 20:09
  • I tried using ShelvedCookieJar, but I found that the results don't get flushed to the disk unless `jar.shelf.close()` is called and the jar isn't usable after `jar.shelf.close()` is called, so managing the lifecycle of the jar is not straightforward. – Jason R. Coombs Dec 01 '22 at 14:34
5

Code for Python 3:

Note that the great majority of cookies on the Internet are Netscape cookies. So if you want to save cookies to disk in the Mozilla cookies.txt file format (which is also used by the Lynx and Netscape browsers), use MozillaCookieJar:

from http.cookiejar import MozillaCookieJar
import requests

s = requests.Session()
s.cookies = MozillaCookieJar('cookies.txt')
# or s.cookies = MozillaCookieJar() and later use s.cookies.filename = 'cookies.txt' or pass the file name to save method.

response = s.get('https://www.msn.com')

s.cookies.save()

The file is overwritten if it already exists, thus wiping all the cookies it contains. Saved cookies can be restored later using the load() or revert() methods.

Note that the save() method won’t save session cookies anyway, unless you ask otherwise by passing a true ignore_discard argument.

s.cookies.save(ignore_discard=True)

Using the load() method:

Load cookies from a file. Old cookies are kept unless overwritten by newly loaded ones.

s.cookies.load()

Using the revert() method:

Clear all cookies and reload cookies from a saved file.

s.cookies.revert()

You may also need to pass a true ignore_discard argument to the load() or revert() methods.
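
The difference between load() and revert() can be checked offline with a short sketch. The helper `make_cookie`, the cookie names, and the far-future expiry timestamp are assumptions for the demonstration; the cookies have an expiry so that they are saved without ignore_discard.

```python
import os
import tempfile
from http.cookiejar import Cookie, MozillaCookieJar


def make_cookie(name, value):
    # Hypothetical helper: a persistent cookie expiring in the year 2100
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=4102444800, discard=False,
        comment=None, comment_url=None, rest={},
    )


path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
jar = MozillaCookieJar(path)

jar.set_cookie(make_cookie("saved", "1"))
jar.save()  # the file now contains only "saved"

jar.set_cookie(make_cookie("unsaved", "2"))  # in memory only

jar.load()  # merges the file into the jar; "unsaved" is kept
after_load = sorted(c.name for c in jar)

jar.revert()  # clears the jar, then reloads the file
after_revert = sorted(c.name for c in jar)

print(after_load)
print(after_revert)
```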

A note about using MozillaCookieJar:

This loses information about RFC 2965 cookies, and also about newer or non-standard cookie attributes such as port.


Sameh Farouk
4

You can pickle the cookies object directly:

cookies = pickle.dumps(session.cookies)

The dict representation misses a lot of information: expiration, domain, path...

It depends on what you intend to do with the cookies, but if you don't have information about the expiration, for example, you have to implement the logic to track expiration by hand.

Pickling the object returned by the library lets you easily reconstruct the state, and then you can rely on the library implementation.

Obviously, this way, the consumer of the pickled object needs to use the same library.

JackNova
4

A simple way is to convert the cookies into a list of dicts and save that to JSON or a database. These are methods of a class that has a session attribute.

def dump_cookies(self):
    cookies = []
    for c in self.session.cookies:
        cookies.append({
            "name": c.name,
            "value": c.value,
            "domain": c.domain,
            "path": c.path,
            "expires": c.expires
        })
    return cookies

def load_cookies(self, cookies):
    for c in cookies:
        self.session.cookies.set(**c)

All we need are five parameters: name, value, domain, path and expires.
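
As a sketch of how such a list of dicts could round-trip through JSON: the functions below are standalone variants of the methods above that take a jar instead of `self`, and the `FIELDS` tuple, cookie names, domains and expiry are illustrative assumptions.

```python
import json

from requests.cookies import RequestsCookieJar

# The five attributes kept per cookie
FIELDS = ("name", "value", "domain", "path", "expires")


def dump_cookies(jar):
    return [{field: getattr(c, field) for field in FIELDS} for c in jar]


def load_cookies(jar, cookies):
    for c in cookies:
        # set() takes name and value, and passes the rest on as cookie attributes
        jar.set(**c)


original = RequestsCookieJar()
original.set("sid", "one", domain="a.example.com", path="/", expires=4102444800)
original.set("sid", "two", domain="b.example.com", path="/app")

text = json.dumps(dump_cookies(original))  # could be written to a file or database

restored = RequestsCookieJar()
load_cookies(restored, json.loads(text))

before = {(c.name, c.value, c.domain, c.path, c.expires) for c in original}
after = {(c.name, c.value, c.domain, c.path, c.expires) for c in restored}
print(before == after)
```

Attributes outside these five, such as the secure flag, are not preserved by this format.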

Mikhail Bulygin
2

dtheodor's answer got me 95% of the way there, except for this:

session = requests.session(cookies=cookies)

For me this raises an exception saying session() does not take arguments.

I worked around it by taking the keys/values from cookie.get_dict and adding them to the session manually using:

session.cookies.set(cookies.keys()[n],cookies.values()[n])
gtalarico
1

For all the solutions that use cookielib: in Python 3 it has been renamed to http.cookiejar. Please look up the question "Python 3.2 won't import cookielib".
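
If the same script has to run on both Python 2 and Python 3, one common pattern is a guarded import. This is a minimal sketch; the file name 'cookiejar' is just an example.

```python
try:
    # Python 3 location of the module
    from http.cookiejar import LWPCookieJar, MozillaCookieJar
except ImportError:
    # Python 2 fallback
    from cookielib import LWPCookieJar, MozillaCookieJar

# The classes behave the same under either name
jar = LWPCookieJar('cookiejar')
print(jar.filename)
```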

Zakir Ayub
1

Inspired by @miracle2k's answer, I implemented jaraco.net.http.cookies.ShelvedCookieJar. I started out using the same shelve.Shelf-backed store, but later found that interface to be insufficiently portable and inspectable, so I created a bespoke Shelf class backed by jsonpickle. This approach has nice portability and concurrency behaviors but also transparency (the filename is inspectable and the cookies file is human and machine readable). Usage is trivially simple:

import requests
from jaraco.net.http.cookies import ShelvedCookieJar

session = requests.Session()
session.cookies = ShelvedCookieJar.create()

Thereafter, cookies are persisted in ./cookies.json. No manual management of when to persist values. No manual opening of files. Simply create and use.

Jason R. Coombs