
I'm implementing a simple upload handler in Python which reads an uploaded file into memory in chunks, gzips and signs each chunk, and re-uploads them to another server for long-term storage. I've already devised a way to read the upload in chunks with my web server, and essentially I have a workflow like this:

class MyUploadHandler:

    def on_file_started(self, file_name):
        # called once when the upload begins
        pass

    def on_file_chunk(self, chunk):
        # called repeatedly as each chunk of the body arrives
        pass

    def on_file_finished(self, file_size):
        # called once when the upload is complete
        pass

This part works great.

Now I need to upload the file in chunks to the final destination after applying my transformations to them. I'm looking for a workflow somewhat like this:

import requests

class MyUploadHandler:

    def on_file_started(self, file_name):
        # streaming_upload is a made-up keyword argument; this is the
        # push-style API I wish requests had
        self.request = requests.put(
                "http://secondaryuploadlocation.com/upload/%s" % (file_name,),
                streaming_upload=True)

    def on_file_chunk(self, chunk):
        self.request.write_body(transform_chunk(chunk))

    def on_file_finished(self, file_size):
        self.request.finish()

Is there a way to do this using the Python requests library? It seems that requests accepts file-like upload objects which it can read from, but I'm not sure exactly what that means or how to apply it to my situation. How can I perform a streaming upload like this?
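
For reference, the streaming support I can find in requests is pull-based: you hand it a file-like object or a generator as data and it reads the body itself. A minimal illustration (the URL and chunks here are placeholders):

import requests

def gen():
    # requests pulls chunks from this generator and sends them with
    # chunked transfer encoding; there is no way for me to push into it
    yield b"first chunk"
    yield b"second chunk"

requests.put("http://secondaryuploadlocation.com/upload/example.bin",
             data=gen())

This is the opposite of my handler above, which pushes chunks as they arrive.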

Naftuli Kay
  • You need to provide a [generator for chunked uploads](http://docs.python-requests.org/en/latest/user/advanced/#chunk-encoded-requests), but that *pulls* data; your code wants to push instead. It'll require a separate thread and a queue, I think. – Martijn Pieters Mar 21 '13 at 21:12
  • Any ideas on where to start with that? My Python threading isn't so good. – Naftuli Kay Mar 21 '13 at 21:18
  • This partly depends on the web framework you are using for this and how it handles concurrency. – Martijn Pieters Mar 21 '13 at 21:25
  • Should I just generate a rogue thread for each upload in order to simultaneously re-upload the file? I'm kind of at a loss for where even to begin implementing this. Could you give a simplified example outside the context of a web application for converting pushed data into pullable data via a generator method? (A sketch of this follows after the comments.) – Naftuli Kay Mar 22 '13 at 02:31
  • Possible duplicate of [How to stream POST data into Python requests?](http://stackoverflow.com/questions/40015869/how-to-stream-post-data-into-python-requests) – vog Oct 14 '16 at 06:59
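
For the record, here is the kind of simplified example asked for in the comments: converting pushed data into pullable data with a queue, a worker thread, and iter(). This is a sketch outside any web framework; the names ChunkPipe and SENTINEL are invented for the example.

import queue
import threading

SENTINEL = object()  # marks the end of the stream

class ChunkPipe:
    """Push chunks in from one thread; iterate them out on another."""

    def __init__(self):
        self.queue = queue.Queue()

    def push(self, chunk):
        self.queue.put(chunk)

    def close(self):
        self.queue.put(SENTINEL)

    def __iter__(self):
        # blocks on queue.get() until the sentinel appears
        return iter(self.queue.get, SENTINEL)

def consumer(pipe):
    # stands in for whatever pulls the data, e.g. an HTTP library
    for chunk in pipe:
        print("pulled:", chunk)

pipe = ChunkPipe()
thread = threading.Thread(target=consumer, args=(pipe,))
thread.start()

for chunk in (b"one", b"two", b"three"):  # stands in for pushed callbacks
    pipe.push(chunk)
pipe.close()
thread.join()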

2 Answers


I would suggest using Python's multiprocessing module. You can use its apply_async routine to process and upload each chunk as it completes, without holding up the other uploads. You can write the processed chunks to a temporary folder and, once the upload has finished, stitch them together.
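
A rough sketch of this idea, adapted to the handler from the question; transform_chunk, the upload URL, and the file layout are placeholders, not a real API:

import os
import tempfile
from multiprocessing import Pool

import requests

def process_chunk(index, chunk, tmp_dir):
    # runs in a worker process: transform the chunk (placeholder from
    # the question) and write it to a numbered temp file
    path = os.path.join(tmp_dir, "chunk-%06d" % index)
    with open(path, "wb") as f:
        f.write(transform_chunk(chunk))
    return path

class MyUploadHandler:

    def on_file_started(self, file_name):
        self.file_name = file_name
        self.tmp_dir = tempfile.mkdtemp()
        self.pool = Pool()
        self.results = []
        self.index = 0

    def on_file_chunk(self, chunk):
        # hand the chunk to a worker without blocking the web server
        self.results.append(self.pool.apply_async(
            process_chunk, (self.index, chunk, self.tmp_dir)))
        self.index += 1

    def on_file_finished(self, file_size):
        paths = [r.get() for r in self.results]  # wait for all workers
        self.pool.close()
        self.pool.join()
        whole = os.path.join(self.tmp_dir, "complete")
        with open(whole, "wb") as out:  # stitch the chunks together
            for path in paths:
                with open(path, "rb") as part:
                    out.write(part.read())
        with open(whole, "rb") as body:  # stream the stitched file out
            requests.put("http://secondaryuploadlocation.com/upload/%s"
                         % (self.file_name,), data=body)

Note that this buffers the whole file on disk before the final upload, so it trades away the streaming requirement for simplicity.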

Sandip Sinha

The following answer to a similar question should solve your problem:

Q: "How to stream POST data into Python requests?"

A: Example code using queue, threading, and iter() with a sentinel

https://stackoverflow.com/a/40018547/19163
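
Condensed, and adapted to the handler from the question, that approach might look like the sketch below (Python 3 module names; transform_chunk and the URL are placeholders carried over from the question):

import queue
import threading

import requests

class MyUploadHandler:

    def on_file_started(self, file_name):
        self.queue = queue.Queue()
        url = "http://secondaryuploadlocation.com/upload/%s" % (file_name,)
        # iter(get, None) yields queued chunks until the sentinel None
        # appears; requests pulls from the iterator and sends a chunked body
        self.thread = threading.Thread(
            target=requests.put, args=(url,),
            kwargs={"data": iter(self.queue.get, None)})
        self.thread.start()

    def on_file_chunk(self, chunk):
        self.queue.put(transform_chunk(chunk))

    def on_file_finished(self, file_size):
        self.queue.put(None)   # sentinel: end of the body
        self.thread.join()     # wait for the PUT to finish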

vog
  • This should be a comment, not an answer. If it is a duplicate question then [vote to close](http://stackoverflow.com/help/privileges/close-questions) as such and/or leave a comment once you [earn](http://meta.stackoverflow.com/q/146472) enough [reputation](http://stackoverflow.com/help/whats-reputation). If the question is not a duplicate then tailor the answer to this specific question. – Petter Friberg Oct 14 '16 at 06:56
  • I thought that marking it as duplicate would be too harsh, but I'll do as you suggest. – vog Oct 14 '16 at 06:58