I am developing an application that uses a series of REST calls to retrieve data. The basic application logic is complete, and data retrieval is structured roughly as follows.

1) The initial data call is completed.

2) For each response in the initial call, a subsequent data call is made to a REST service that requires basic authentication.

Performing these calls sequentially can add up to a long wait for the end user, so I am trying to implement threading to speed up the process (being IO-bound makes this an ideal candidate for threading). The problem is that authentication fails on the threaded calls.

If I perform the calls sequentially, everything works fine, but with the threaded approach I end up with 401 authentication errors or 500 internal server errors from the server.

I have talked to the REST service admins, and they know of nothing on the server end that would prevent concurrent connections from the same user, so I am wondering if this is an issue on the urllib2 end.

Does anyone have any experience with this?

EDIT:

While I am unable to post the exact code, I will post a reasonable representation of what I am doing, with a very similar structure.

import threading
import urllib2

class UrlThread(threading.Thread):
    def __init__(self, data):
        threading.Thread.__init__(self)
        self.data = data

    def run(self):
        # Build an opener that answers basic-auth challenges for the REST service.
        password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
        password_manager.add_password(None, 'https://url/to/Rest_Svc/', 'uid', 'passwd')
        auth_manager = urllib2.HTTPBasicAuthHandler(password_manager)
        opener = urllib2.build_opener(auth_manager)
        # install_opener() replaces the module-wide default opener used by urlopen().
        urllib2.install_opener(opener)
        option = self.data[0]
        urlToOpen = 'https://url/to/Rest_Svc/?option='+option
        rawData = urllib2.urlopen(urlToOpen)
        wsData = rawData.readlines()
        if wsData:
            print('success')

#firstCallRows is a list of lists containing the data returned 
#from the initial call I mentioned earlier.
thread_list = []
for row in firstCallRows:
    t = UrlThread(row)
    t.setDaemon(True)
    t.start()
    thread_list.append(t)

for thread in thread_list:
    thread.join()
ntlarson
  • Can you post the code that you are using? I've used multiprocessing several times for similar tasks and have some example code, but I don't want to post it if it won't be of help. – Nolen Royalty Apr 13 '12 at 14:21
  • possible duplicate of [Are urllib2 and httplib thread safe?](http://stackoverflow.com/questions/5825151/are-urllib2-and-httplib-thread-safe) – Ignacio Vazquez-Abrams Apr 13 '12 at 15:06
  • Ignacio, I had seen that post and considered this a separate topic due to the fact that the common example of threading in python is done with urllib2 and I can thread urllib2 fine so long as I am not including authentication. My question was/is more specific to the nature of threading urllib2 when authentication is required. However if this is deemed a copy, then I apologize. – ntlarson Apr 13 '12 at 15:36

1 Answer

With Requests you could do something like this:

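# Written for the Requests release current when this answer was posted (Python 2).
# Later Requests releases dropped the `async` module (the functionality moved to
# the separate grequests package), and `async` is a reserved word from Python 3.7 on.
from requests import session, async

auth = ('username', 'password')
url = 'http://example.com/api/'
options = ['foo1', 'foo2', 'foo3']

# One session carries the basic-auth credentials for every request.
s = session(auth=auth)

# Build the requests without sending them yet.
rs = [async.get(url, params={'option': opt}, session=s) for opt in options]

# imap() sends them concurrently and yields each response as it completes.
responses = async.imap(rs)

for r in responses:
    print r.text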

Relevant documentation:
Sessions
Asynchronous requests
Basic authentication
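
For comparison, here is a minimal standard-library sketch of the same idea. It makes several assumptions: it targets Python 3, where `urllib2` became `urllib.request`; the helper names `make_opener`, `build_url`, `fetch_option` and `fetch_all` are made up for illustration; and it guesses, without confirmation from the question, that the process-wide `install_opener()` call is involved in the 401/500 errors. Each call builds its own opener and uses `opener.open()` directly, so no global state is shared between threads.

```python
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def make_opener(base_url, user, password):
    """Build a private basic-auth opener; nothing global is installed."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, base_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler)


def build_url(base_url, option):
    """Append the 'option' query parameter, URL-encoded."""
    return base_url + '?' + urllib.parse.urlencode({'option': option})


def fetch_option(base_url, user, password, option):
    """Fetch one option with an opener owned by this call."""
    opener = make_opener(base_url, user, password)
    with opener.open(build_url(base_url, option), timeout=30) as response:
        return response.read()


def fetch_all(base_url, user, password, options, max_workers=8):
    """Fetch every option concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda opt: fetch_option(base_url, user, password, opt),
            options))
```

With the placeholder values from the question, a call would look like `fetch_all('https://url/to/Rest_Svc/', 'uid', 'passwd', ['foo1', 'foo2', 'foo3'])`.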

Acorn
  • I had seen Requests a while ago when I was first writing this code (before revisiting it for threading) and I just now stumbled across it again. I am going to give it a try to see if threading with it will work.... though the fact that it has a built in async functionality is really nice too. Thanks for the post. – ntlarson Apr 13 '12 at 15:30
  • No problem. I just switched it over to use `async.imap` instead of `async.map`. That way you get a generator that gives you the requests as they complete, instead of blocking until all the requests have completed. – Acorn Apr 13 '12 at 15:33
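
The difference described in that last comment, between getting results as they complete and waiting for all of them, can be illustrated with the Python 3 standard library. This is only an analogy, not the Requests `async` module itself: `slow_call` is a hypothetical stand-in that sleeps instead of making a REST call, and the other function names are likewise made up for the sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_call(name, delay):
    """Stand-in for a REST call: sleeps instead of touching the network."""
    time.sleep(delay)
    return name


def results_as_completed(jobs):
    """Yield each result as soon as its call finishes (imap-like)."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(slow_call, name, delay) for name, delay in jobs]
        for future in as_completed(futures):
            yield future.result()


def results_in_order(jobs):
    """Return all results in input order once every call is done (map-like)."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        return list(pool.map(lambda job: slow_call(*job), jobs))
```

Given `jobs = [('slow', 0.3), ('fast', 0.05)]`, the first function yields the fast result before the slow one, while the second returns both only after the slow call has finished, in the order they were submitted.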