
I need to read a 5 MB file on Google App Engine (Python 2.7) and access it frequently.

Reading a file on GAE is not difficult; see: Read a file on App Engine with Python?

The difficult part is storing it somewhere I can read it from frequently and as fast as possible. At 5 MB, it exceeds the 1 MB datastore entity limit.

I am considering Blobstore, but I am afraid it is not fast enough. Is reading from Blobstore faster than reading a file?

I am also thinking about putting the whole file in memcache. Is that possible? Is memcache big enough to store a 5 MB file?

Just as on a regular computer, I want to keep this file in memory rather than on disk.

Any suggestions?

Thanks a lot!

Gaby Solis
  • Maybe you can store the data in a Python file as a base64-encoded string and decode it in init – Pramod Aug 08 '12 at 06:01
  • @Pramod Good idea! As this answer says: http://stackoverflow.com/questions/1462208/is-it-correct-that-you-are-allowed-3-000-files-per-app-engine-app-not-1-000 one file can be as large as 10 MB, so it is possible to store all the data in a .py code file. But why base64? Is "decode" necessary every time I use the data? It would be faster without "decode". – Gaby Solis Aug 08 '12 at 06:14
  • Memcache puts a 1 MB limit on cached values ([see here](https://developers.google.com/appengine/docs/python/memcache/overview)). Is there no way you can split it into smaller chunks? – Haldean Brown Aug 08 '12 at 06:46

1 Answer


If your file does not change, then you can simply put it in your project directory and have it served as a static file.

Now on to questions:

  1. Blobstore will be fast enough because all requests (blobstore or user code) on GAE go through a transparent cache. You can simply set an appropriate Cache-Control header on the blobstore response to have it cached.

  2. The maximum size of a memcache value is 1 MB. Data in memcache can also be evicted at any time, so you need to keep it in permanent storage anyway. I also doubt it would be faster, because your frontend instance has to fetch the data from memcache and then serve it, while Blobstore serving works a bit differently (by intercepting the response and inserting the data into the body).

  3. IMHO, the fastest option is to have the data served via the transparent cache.
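If you still want to try memcache despite the 1 MB limit from point 2, the chunking idea from the comments could look roughly like the sketch below. This is only an illustration under assumptions: `store_chunked`, `load_chunked`, `CHUNK_SIZE` and `FakeCache` are hypothetical names of mine, and `FakeCache` is an in-process stand-in so the snippet runs anywhere. On App Engine you would pass the `google.appengine.api.memcache` module instead, whose `set_multi` and `get_multi` accept a `key_prefix` argument in the same way.

```python
# Memcache rejects values over about 1 MB, so keep each chunk below that
# with some headroom for overhead (the exact margin is an assumption).
CHUNK_SIZE = 950 * 1000


class FakeCache(object):
    """Hypothetical in-process stand-in for the two memcache calls used."""

    def __init__(self):
        self._data = {}

    def set_multi(self, mapping, key_prefix=''):
        for key, value in mapping.items():
            self._data[key_prefix + key] = value
        return []  # like memcache: the list of keys that failed to store

    def get_multi(self, keys, key_prefix=''):
        # Like memcache: returned keys do not include the prefix.
        return dict((k, self._data[key_prefix + k])
                    for k in keys if key_prefix + k in self._data)


def store_chunked(cache, name, payload, chunk_size=CHUNK_SIZE):
    """Split payload (bytes) into chunks and store them under name."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    mapping = dict(('%d' % i, chunk) for i, chunk in enumerate(chunks))
    mapping['count'] = len(chunks)  # needed to rebuild the key list later
    failed = cache.set_multi(mapping, key_prefix=name + ':')
    return not failed


def load_chunked(cache, name):
    """Return the reassembled bytes, or None if any piece is missing."""
    prefix = name + ':'
    meta = cache.get_multi(['count'], key_prefix=prefix)
    if 'count' not in meta:
        return None
    keys = ['%d' % i for i in range(meta['count'])]
    found = cache.get_multi(keys, key_prefix=prefix)
    if len(found) != len(keys):
        return None  # at least one chunk was evicted
    return b''.join(found[k] for k in keys)
```

Because any chunk can be evicted independently, `load_chunked` returning `None` has to be treated as a cache miss, and the file reloaded from permanent storage (Blobstore or the file shipped with the app) before being stored again.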

Also, if you want to serve images, you might want to use the Image Service, as it seems to be faster than Blobstore.

Peter Knego