3

As far as I've been able to tell, cookielib isn't thread-safe; but then again, the post stating so is five years old, so it might be wrong.

Nevertheless, I've been wondering: if I instantiate a class like this

import cookielib
import urllib2

class Acc:
    jar = cookielib.CookieJar()
    cookie = urllib2.HTTPCookieProcessor(jar)
    opener = urllib2.build_opener(cookie)

    headers = {}

    def __init__(self, login, password):
        self.user = login
        self.password = password

    def login(self):
        return False  # Some magic, irrelevant

    def fetch(self, url):
        req = urllib2.Request(url, None, self.headers)
        res = self.opener.open(req)
        return res.read()

for each worker thread, would it work (or is there a better approach)? Each thread would use its own account, so the fact that workers wouldn't share their cookies is not a problem.

Robus
  • For reference, the post OP mentions is probably [this](http://bytes.com/topic/python/answers/40838-cookielib-urllib2-thread-safe) one. – Piotr Dobrogost Apr 28 '11 at 21:00

3 Answers

2

You can read the library's implementation in [python_install_path]/lib/cookielib.py to confirm that cookielib.CookieJar is thread-safe.

This means that if you share one instance of CookieJar between several connections in different threads, you will not even see inconsistent reads of the cookie set, because CookieJar guards it internally with the lock self._cookies_lock.
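A quick sanity check of the shared-jar case, as a sketch only: several threads store cookies in one CookieJar and the total is counted afterwards. The cookie names, the example.com domain, and the helper functions are made up for illustration, and the import fallback is an assumption so the same code also runs on Python 3, where the module is called http.cookiejar. A passing run is consistent with the locking described above but does not prove thread safety.

```python
import threading

try:  # Python 2 name, as used in the question
    import cookielib
except ImportError:  # Python 3 renamed the module
    import http.cookiejar as cookielib


def make_cookie(name, value, domain="example.com"):
    """Build a minimal cookie by hand (all values here are made up)."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def fill_jar(jar, worker_id, count):
    """Store `count` distinctly named cookies in the shared jar."""
    for i in range(count):
        jar.set_cookie(make_cookie("w%d_c%d" % (worker_id, i), "x"))


shared_jar = cookielib.CookieJar()
threads = [threading.Thread(target=fill_jar, args=(shared_jar, n, 200))
           for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

cookie_count = len(shared_jar)  # number of cookies stored by all threads
```

If no cookie was lost, cookie_count equals the number of threads times the cookies stored per thread.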

Serge S.
2

You want to use pycurl (the Python interface to libcurl). It's thread-safe and supports cookies, HTTPS, etc. The interface is a bit strange, but it just takes a bit of getting used to.

I've only used pycurl with HTTP basic auth + SSL, but I did find an example using pycurl and cookies here. I believe you'll need to change the pycurl.COOKIEFILE (line 74) and pycurl.COOKIEJAR (line 82) settings to use some unique name (maybe keying off of id(self.crl)).

As I remember, you'll need to create a new pycurl.Curl() for each request to maintain thread safety.
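A rough sketch of that idea, not taken from the linked example: one fresh pycurl.Curl() per request, with the cookie file named after a per-account key so that workers do not overwrite each other's cookies. The helper names, the file-naming scheme, and the choice to key on an account name rather than id(self.crl) are assumptions for illustration. pycurl is imported inside the function so the path helper can be used without it installed.

```python
import os
import tempfile
from io import BytesIO


def cookie_path_for(key, directory=None):
    """Return a cookie-file path unique to `key` (hypothetical naming scheme)."""
    directory = directory or tempfile.gettempdir()
    return os.path.join(directory, "cookies_%s.txt" % key)


def fetch_with_pycurl(url, cookie_path):
    """Fetch `url` with a fresh Curl handle that reads and writes `cookie_path`."""
    import pycurl  # third-party; imported lazily so cookie_path_for works without it

    buf = BytesIO()
    curl = pycurl.Curl()  # one handle per request, never shared between threads
    try:
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.COOKIEFILE, cookie_path)  # cookies to load and send
        curl.setopt(pycurl.COOKIEJAR, cookie_path)   # file written when the handle closes
        curl.setopt(pycurl.WRITEFUNCTION, buf.write)
        curl.perform()
    finally:
        curl.close()
    return buf.getvalue()
```

Each worker would then call something like fetch_with_pycurl(url, cookie_path_for(account_name)), where account_name is whatever identifies that worker's account.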

Sam Dolan
1

I have the same question as you. If you do not use pycurl, I think you must call urllib2.install_opener(self.opener) before each urllib2.urlopen call.

Maybe I should use pycurl too; urllib2 is not so smart.
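For comparison, a sketch of the route that stays with urllib2, assuming each worker owns one object. The jar and opener are created in __init__ rather than in the class body, because attributes defined in the class body are shared by every instance. Calling self.opener.open() directly does not go through the module-level opener that urllib2.install_opener sets, so nothing global changes between requests. The credentials are placeholders, and the import fallback is an assumption so the code also runs on Python 3.

```python
try:  # Python 2 names, as in the question
    import cookielib
    import urllib2
except ImportError:  # Python 3 equivalents under the same aliases
    import http.cookiejar as cookielib
    import urllib.request as urllib2


class Acc(object):
    def __init__(self, login, password):
        self.user = login
        self.password = password
        self.headers = {}
        # Created per instance, so each account keeps its own cookies.
        self.jar = cookielib.CookieJar()
        self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.jar))

    def fetch(self, url):
        req = urllib2.Request(url, None, self.headers)
        # Uses this object's opener, not the global one behind urllib2.urlopen().
        res = self.opener.open(req)
        try:
            return res.read()
        finally:
            res.close()


first = Acc("alice", "secret")   # hypothetical credentials
second = Acc("bob", "secret")
```

Here first and second hold separate jars and openers, so a login performed through one does not leak cookies into the other.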

okma