
Presently I have a GAE app that does some offline processing (backs up a user's data) and generates a file somewhere in the neighbourhood of 10-100MB. I'm not sure of the best way to serve this file to the user. The two options I'm considering are:

  1. Adding code to the offline processing that 'spoofs' a form upload to the blobstore, then going through the normal blobstore process to serve the file.

  2. Having the offline processing code store the file somewhere off GAE, and serving it from there.

Is there a much better approach I'm overlooking? I'm guessing this is functionality that isn't well suited to GAE. I had thought of storing the file in the datastore as db.Text or db.Blob, but there I run into the 1MB entity limit.

Any input would be appreciated,

John Carter

4 Answers


I think storing it in the blobstore via a form post is your best currently-available option. We have plans to implement programmatic blobstore writing, but it's not quite ready yet.
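
A rough sketch of what that form post might look like from the offline processing code (the '/upload_done' handler path and helper name are hypothetical, and note Wooble's caveat in the comments below about the 1MB urlfetch payload cap):

    from google.appengine.api import urlfetch
    from google.appengine.ext import blobstore

    def post_to_blobstore(data, filename='backup.zip'):
        # Ask the blobstore for a one-shot upload URL; '/upload_done' is a
        # hypothetical handler that records the resulting blob key.
        upload_url = blobstore.create_upload_url('/upload_done')
        # Build a minimal multipart/form-data body by hand to 'spoof' the
        # browser form upload.
        boundary = '----gae-backup-boundary'
        body = '\r\n'.join([
            '--' + boundary,
            'Content-Disposition: form-data; name="file"; filename="%s"' % filename,
            'Content-Type: application/octet-stream',
            '',
            data,
            '--' + boundary + '--',
            '',
        ])
        # create_upload_url may return a path; prefix your app's domain if so.
        return urlfetch.fetch(
            url=upload_url,
            payload=body,
            method=urlfetch.POST,
            headers={'Content-Type': 'multipart/form-data; boundary=' + boundary})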

Nick Johnson
  • I don't believe this will work, since you can't have a urlfetch payload of greater than 1MB (at least according to the docs), and certainly not up to 100MB. – Wooble Jan 18 '11 at 13:38
  • Where does urlfetch come into it? Programmatic writing to the blobstore is available now: http://code.google.com/appengine/docs/python/blobstore/overview.html#Writing_Files_to_the_Blobstore – Anentropic Feb 29 '12 at 17:59

It's worth mentioning that, as of some time ago, you can use the experimental feature of the blobstore to write files programmatically.
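
A minimal sketch of that, assuming the Files API in google.appengine.api.files:

    from google.appengine.api import files

    def write_backup_to_blobstore(data):
        # Create a writable blobstore file.
        file_name = files.blobstore.create(mime_type='application/octet-stream')
        # Append the data; for a 10-100MB file you would write in pieces.
        with files.open(file_name, 'a') as f:
            f.write(data)
        # Finalize so the file becomes a readable blob, then fetch its key.
        files.finalize(file_name)
        return files.blobstore.get_blob_key(file_name)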

Then you can serve the file as a download using a nice BlobstoreDownloadHandler.
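
For example, roughly following the pattern from the blobstore docs (the URL routing is assumed):

    import urllib

    from google.appengine.ext import blobstore
    from google.appengine.ext.webapp import blobstore_handlers

    class ServeBackupHandler(blobstore_handlers.BlobstoreDownloadHandler):
        def get(self, resource):
            # The blob key arrives URL-encoded in the request path.
            blob_key = str(urllib.unquote(resource))
            if not blobstore.get(blob_key):
                self.error(404)
            else:
                # send_blob streams the blob to the user; the file itself
                # never passes through application code.
                self.send_blob(blob_key, save_as=True)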

Jimmy Kane

I would stick with the first option. Preparing the blob will require some additional coding, but the blobstore API allows serving byte ranges of the file:

http://code.google.com/appengine/docs/python/blobstore/overview.html#Serving_a_Blob

You will not need to implement serving file chunks yourself.
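
If you would rather not subclass BlobstoreDownloadHandler, here is a sketch of the underlying mechanism using the response headers App Engine recognizes for blob serving (the range value is illustrative):

    from google.appengine.ext import webapp

    class RangeServeHandler(webapp.RequestHandler):
        def get(self, blob_key):
            # This header tells App Engine's serving infrastructure to
            # stream the blob itself, outside normal response size limits.
            self.response.headers['X-AppEngine-BlobKey'] = blob_key
            # Optionally serve only part of the blob, e.g. the first 1MB.
            self.response.headers['X-AppEngine-BlobRange'] = 'bytes=0-1048575'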

Piotr Duda
  • Found some related questions: http://stackoverflow.com/questions/2149198/directly-putting-data-in-appengines-blobstore http://stackoverflow.com/questions/680305/using-multipartposthandler-to-post-form-data-with-python And some code here: http://www.aleax.it/blosto.zip – John Carter Jan 17 '11 at 21:30

There is one approach you are overlooking, although I'm not sure whether it is that much better:

Split the data into many 1MB chunks, and use individual requests to transfer the chunks.

This would require cooperation from the outside application to actually retrieve the data in chunks; you might want to use the HTTP Range header to maintain the illusion of a single file. Then have another entity that keeps the IDs of all the individual chunks.
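
A hypothetical sketch of the storage side (entity and property names are illustrative):

    from google.appengine.ext import db

    CHUNK_SIZE = 1000000  # stay under the 1MB datastore entity limit

    class FileChunk(db.Model):
        file_id = db.StringProperty()
        index = db.IntegerProperty()
        data = db.BlobProperty()

    def store_in_chunks(file_id, data):
        # Write the file as an ordered series of <=1MB entities; a manifest
        # entity (not shown) would record the total size and chunk count.
        for i in range(0, len(data), CHUNK_SIZE):
            FileChunk(file_id=file_id, index=i // CHUNK_SIZE,
                      data=db.Blob(data[i:i + CHUNK_SIZE])).put()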

Martin v. Löwis
  • Good point, forgot about that approach, but sadly I don't think it'll work for this application. – John Carter Jan 18 '11 at 00:07
  • The maximum response size is actually 10MB, although if the files are 10-100MB this is a minor detail; you'd just need to split into larger chunks. – Wooble Jan 18 '11 at 13:37