In a Django app I'm using a third-party Python script that lets users upload files to blip.tv through httplib.HTTPConnection.send, running on an EC2 instance. Since these files are generally large, I will use a message queue (RabbitMQ/Celery) to process the upload asynchronously and report the progress to the user in the frontend.

The script opens the connection and sends the data in this part:

host, selector = urlparts = urlparse.urlsplit(url)[1:3]
h = httplib.HTTPConnection(host)
h.putrequest("POST", selector)
h.putheader("content-type", content_type)
h.putheader("content-length", len(data))
h.endheaders()
h.send(data)    
response = h.getresponse()
return response.status, response.reason, response.read()  

getresponse() only returns after the file transfer has completed. How do I report the progress during the transfer (I assume with something like stdout.write) so that I can write the value to the cache framework for display (djangosnippets 678/679)? Alternatively, if there is a better practice for this, I'm all ears!
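
One way to get progress out of plain httplib is to call send() several times with slices of the data instead of once with the whole body. The sketch below assumes the headers have already been sent as in the script above; send_with_progress, callback and chunk_size are hypothetical names, not part of httplib. Note that it reports bytes handed to the socket, which is not necessarily what the server has received.

```python
def send_with_progress(conn, data, callback, chunk_size=64 * 1024):
    """Send `data` in chunks over `conn`, reporting progress after each chunk.

    `conn` is any object with a send() method, for example an
    httplib.HTTPConnection on which putrequest(), putheader() and
    endheaders() have already been called.
    """
    total = len(data)
    sent = 0
    while sent < total:
        chunk = data[sent:sent + chunk_size]
        conn.send(chunk)
        sent += len(chunk)
        # Report bytes sent so far and the overall size.
        callback(sent, total)
    return sent
```

In the script above this would stand in for h.send(data), for example send_with_progress(h, data, my_callback), where my_callback(sent, total) writes the percentage wherever the frontend can read it.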

EDIT:

I've since switched to urllib2 and used a tip from this question to override the file's read() method to get the upload progress. I'm also using poster to generate the multipart form encoding. Here's the latest code:

import os
import urllib2

from poster.encode import multipart_encode
from poster.streaminghttp import register_openers

def Upload(video_id, username, password, title, description, filename):

    class Progress(object):
        def __init__(self):
            self._seen = 0.0

        def update(self, total, size, name):
            self._seen += size
            pct = (self._seen / total) * 100.0
            print '%s progress: %.2f' % (name, pct)

    class file_with_callback(file):
        def __init__(self, path, mode, callback, *args):
            file.__init__(self, path, mode)
            self.seek(0, os.SEEK_END)
            self._total = self.tell()
            self.seek(0)
            self._callback = callback
            self._args = args

        def __len__(self):
            return self._total

        def read(self, size):
            data = file.read(self, size)
            self._callback(self._total, len(data), *self._args)
            return data

    progress = Progress()
    stream = file_with_callback(filename, 'rb', progress.update, filename)

    datagen, headers = multipart_encode({
                                        "post": "1",
                                        "skin": "xmlhttprequest",
                                        "userlogin": "%s" % username,
                                        "password": "%s" % password,
                                        "item_type": "file",
                                        "title": "%s" % title.encode("utf-8"),
                                        "description": "%s" % description.encode("utf-8"),                                             
                                         "file": stream
                                         })    

    opener = register_openers()

    req = urllib2.Request(UPLOAD_URL, datagen, headers)
    response = urllib2.urlopen(req)
    return response.read()

This only works for file path inputs, though, not for an InMemoryUploadedFile that comes from a form input (request.FILES). I suppose that's because it tries to open a file that is already held in memory. I get a TypeError on the line "stream = file_with_callback(filename, 'rb', progress.update, filename)":

coercing to Unicode: need string or buffer, InMemoryUploadedFile found

How can I achieve the same progress reporting with the user-uploaded file? Also, will reading out the progress like this consume a lot of memory? Perhaps an upload counterpart to this download progress solution for urllib2 would work better, but I don't know how to implement it. Any help is welcome.
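
A possible direction, sketched under assumptions: instead of subclassing file (which needs a path), wrap whatever file-like object is already at hand and forward read() to it. ReadProgressWrapper is a hypothetical name; the callback uses the same (total, size, name) arguments as Progress.update above. It assumes the wrapped object supports read(), seek() and tell(), and whether a given upload library accepts such a wrapper in place of a real file is not verified here.

```python
import io


class ReadProgressWrapper(object):
    """Wrap a seekable file-like object and report every read()."""

    def __init__(self, fileobj, callback, name=None):
        self._fileobj = fileobj
        self._callback = callback
        self.name = name or getattr(fileobj, "name", "upload")
        # Measure the size by seeking, without reading the content.
        fileobj.seek(0, io.SEEK_END)
        self._total = fileobj.tell()
        fileobj.seek(0)

    def __len__(self):
        return self._total

    def read(self, size=-1):
        data = self._fileobj.read(size)
        # Same argument order as Progress.update(total, size, name).
        self._callback(self._total, len(data), self.name)
        return data

    def seek(self, *args):
        return self._fileobj.seek(*args)

    def tell(self):
        return self._fileobj.tell()
```

The wrapper keeps no copy of the data, so by itself it adds no memory beyond the chunk currently being read.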

Community
HdN8

1 Answer

It turns out that the poster library has a callback hook in multipart_encode (the cb argument), which can be used to get the upload progress out. Good stuff...

Although I suppose I have technically answered this question, I'm sure there are other ways to skin this cat, so I'll post more if I find other methods or details.

Here's the code:

def prog_callback(param, current, total):
    # poster calls this while encoding: `current` bytes done out of `total`
    pct = 100.0 * current / total
    print "Progress: %.2f" % pct


datagen, headers = multipart_encode({
                                    "post": "1",
                                    "skin": "xmlhttprequest",
                                    "userlogin": "%s" % username,
                                    "password": "%s" % password,
                                    "item_type": "file",
                                    "title": "%s" % title.encode("utf-8"),
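                                    "description": "%s" % description.encode("utf-8"),
                                    # should be a file-like object (with read()); as far as I can
                                    # tell a plain path string would be sent as an ordinary text field
                                    "file": filename
                                    }, cb=prog_callback)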

opener = register_openers()

req = urllib2.Request(UPLOAD_URL, datagen, headers)
response = urllib2.urlopen(req)
return response.read()
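
To get the value into the cache framework rather than printing it, the callback can write to a cache object. This is a minimal sketch, assuming a cache with a set(key, value, timeout) method such as Django's; make_cache_callback and the key name are hypothetical. It only writes when the whole-number percentage changes, so the cache is not hit once per chunk.

```python
def make_cache_callback(cache, key, timeout=3600):
    """Build a (param, current, total) callback that stores the percentage in `cache`."""
    state = {"last": None}

    def callback(param, current, total):
        if not total:
            return
        pct = int(100 * current / total)
        # Skip the write unless the whole-number percentage has changed.
        if pct != state["last"]:
            state["last"] = pct
            cache.set(key, pct, timeout)

    return callback
```

It would be passed as cb=make_cache_callback(cache, "upload-progress-%s" % video_id) after from django.core.cache import cache, and a view polled by the frontend can then return cache.get() for the same key.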
HdN8