
The limit is 1MB, according to the docs, which I assumed means 1024**2 bytes, but apparently not.

I've got a simple function which stringifies large Python objects into JSON, splits the JSON string into smaller chunks, and puts the chunks (as BlobProperty) along with a separate index entity into the datastore (and memcache, using ndb), plus another function to reverse this.
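
Roughly, the functions look like this (a minimal sketch; the model names, the index scheme, and CHUNK_SIZE are just illustrative, and ndb's built-in caching takes care of the memcache part):

```python
import json
from google.appengine.ext import ndb

CHUNK_SIZE = 1000 ** 2  # the chunk size in question

class JsonChunk(ndb.Model):
    data = ndb.BlobProperty()

class JsonIndex(ndb.Model):
    chunk_keys = ndb.KeyProperty(repeated=True)

def store_json(name, obj):
    """Serialize obj to JSON, split it, and store the chunks plus an index."""
    payload = json.dumps(obj)
    chunk_keys = []
    for i in range(0, len(payload), CHUNK_SIZE):
        chunk_keys.append(JsonChunk(data=payload[i:i + CHUNK_SIZE]).put())
    JsonIndex(id=name, chunk_keys=chunk_keys).put()

def load_json(name):
    """Reassemble the JSON string from its chunks and parse it."""
    index = JsonIndex.get_by_id(name)
    chunks = ndb.get_multi(index.chunk_keys)
    return json.loads(''.join(chunk.data for chunk in chunks))
```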

First I tried splitting into 1024**2-byte chunks, but the datastore complained about it. Currently I'm using 1000**2 and it has worked without errors. I could have answered my own question here already, if it weren't for this comment by Guido, with code that splits into 950,000-byte chunks. If Guido does it, it must be for a reason, I figured. Why the 50K safety margin?

Maybe we can get a definitive answer on this, to not waste even a single byte. I'm aware of Blobstore.

1 Answer


The limit is 1MB - that is, 2**20 bytes - but that limit is for the encoded version of the entity, which includes all the metadata and encoding overhead.
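
If you want to see the encoded size for yourself, one rough way is to serialize the entity and measure the protobuf. This relies on ndb's internal `_to_pb()` rather than a documented API, so treat it only as an estimate:

```python
from google.appengine.ext import ndb

class Chunk(ndb.Model):  # hypothetical model, just for measuring
    data = ndb.BlobProperty()

def encoded_size(entity):
    # _to_pb() is ndb-internal; the returned protobuf reports its serialized size.
    return entity._to_pb().ByteSize()

print(encoded_size(Chunk(data='x' * (1000 ** 2))))  # larger than 1000**2 due to overhead
```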

One option is to leave some wiggle-room, as you're doing. Another is to catch the error and subdivide chunks if necessary.
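
A rough sketch of the catch-and-subdivide idea, reusing the illustrative JsonChunk model from the question. The exact exception raised for an oversized entity is an assumption here, so adjust it to whatever your put() actually raises:

```python
from google.appengine.api import datastore_errors
from google.appengine.ext import ndb

class JsonChunk(ndb.Model):  # same illustrative model as in the question
    data = ndb.BlobProperty()

def put_chunks(payload, chunk_size):
    """Store payload in chunk_size pieces; halve the size if a put is rejected."""
    try:
        keys = []
        for i in range(0, len(payload), chunk_size):
            keys.append(JsonChunk(data=payload[i:i + chunk_size]).put())
        return keys
    except datastore_errors.BadRequestError:
        # Entity too large once encoded; retry with smaller chunks.
        # (A real version would also delete any chunks already written.)
        return put_chunks(payload, chunk_size // 2)
```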

If you're having to split stuff up like this, however, the Blobstore may be a better choice for your data than the datastore.

Nick Johnson
  • Is it possible to calculate the overhead and metadata? Is the entity completely "packaged" before being put to the datastore, or does the datastore itself add additional overhead to it? I'm aware of Blobstore, but it requires additional management, because the entities are overwritten frequently. It's only a few MB anyway. –  Oct 29 '12 at 18:00