3

I'm required to process large files, up to 2GB, in GAE (I'm using Python). Of course I'll be running the code on a backend; however, since local storage isn't available, the data will need to be in memory.

Is there a file-descriptor-like wrapper for boto or another supported cloud-storage protocol? Or another recommended technique?

Thanks, Shay

Shay
  • 1,245
  • 7
  • 14
  • You can use boto to access S3. Check out the link here: http://stackoverflow.com/questions/3948391/is-is-possible-to-read-a-file-from-s3-in-google-app-engine-using-boto –  Apr 21 '13 at 07:31

2 Answers

1

You may be interested in the "Google Cloud Storage Python API Overview". It works like a regular local file. I've used it in my project and haven't encountered any problems with it.
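For illustration, a minimal sketch of reading a large object sequentially with the GAE cloudstorage client library (the bucket/object path and the process() helper are placeholders, not part of the original answer):

    import cloudstorage

    # The handle returned by cloudstorage.open() is file-like, so the
    # object can be read in fixed-size chunks rather than loading all
    # 2GB into memory at once.
    with cloudstorage.open('/my-bucket/big-input-file.csv') as gcs_file:
        while True:
            chunk = gcs_file.read(1024 * 1024)  # 1MB at a time
            if not chunk:
                break
            process(chunk)  # hypothetical per-chunk processing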

Vladimir Obrizan
  • 2,538
  • 2
  • 18
  • 36
  • Unfortunately, the mentioned "Files API" is deprecated. The official successor, GCS, does not behave like a local file as it lacks support for modifying/appending. Any updated solution? – jsphpl Nov 04 '15 at 12:51
  • @jsphpl, here is the updated link: https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/ – Vladimir Obrizan Nov 05 '15 at 12:12
  • Thanks for the link. I don't know how to say it differently, so please read my initial comment again. – jsphpl Nov 05 '15 at 14:24
  • @jsphpl, here is an example of how you can work with arbitrary-size files in Google Cloud Storage with Python from Google App Engine: https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/getstarted I use this approach for my app and it works. And it does work for appending. Did you try it? Did it work? If not, why? – Vladimir Obrizan Nov 06 '15 at 10:32
0

The data file doesn't "need to be in memory", and if you try that you will run out of memory (OOM). If you can process it sequentially, open it as a file stream. I've done that with the Blobstore; it should be similar.
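For illustration, a minimal sketch of that sequential-streaming approach using the Blobstore API's BlobReader (the blob_key variable and the process() helper are placeholders, not part of the original answer):

    from google.appengine.ext import blobstore

    # BlobReader exposes a Blobstore value as a read-only file-like
    # object; only buffer_size bytes are held in memory at a time.
    reader = blobstore.BlobReader(blob_key, buffer_size=1024 * 1024)
    while True:
        chunk = reader.read(1024 * 1024)
        if not chunk:
            break
        process(chunk)  # hypothetical per-chunk processing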

Zig Mandel
  • 19,571
  • 5
  • 26
  • 36