Difference between namespace and ancestor in data structure

Question

What could be the difference between

key := datastore.NameKey("user", userID, nil)
client.Put(ctx, datastore.IncompleteKey("session", key), &sessionUser)

and

key := &datastore.Key{Kind: "session", Parent: nil, Namespace: userID}
client.Put(ctx, key, &sessionUser)

Why would they behave differently if both involve the same writes/reads that can cause contention? From this article:

Cloud Datastore prepends the namespace and the kind of the root entity group to the Bigtable row key. You can hit a hotspot if you start to write to a new namespace or kind without gradually ramping up traffic.

I'm really confused about how I should structure my data because of that. By the way, which of them is faster when reading?

Is there a reason you are not using the `datastore.NewKey` or `datastore.NewIncompleteKey` methods? — chris, Jun 27 '18 at 00:52
Neither IncompleteKey nor NameKey accepts a namespace argument, so that's why I build the key directly like that — John Balvin Arias, Jun 27 '18 at 01:01

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

The difference is that the namespace contention corner case you mentioned is only a transient one. From the root-cause perspective it is equivalent to this one:

...

If you create new entities at a very high rate for a kind which previously had very few existing entities. Bigtable will start off with all entities on the same tablet server and will take some time to split the range of keys onto separate tablet servers.

...

The transient lasts only until enough tablet splits occur to keep up with the write ops rate. For the case you quoted, a gradual traffic ramp-up gives these splits time to happen before you hit errors, which avoids the issue. Even without a gradual ramp-up, contention can occur only until the splits happen, after which it disappears.

Using an ancestry, on the other hand, raises a permanent problem of a different kind. All entities sharing the same ancestry are placed in the same entity group, so they all share the limit of 1 write per second per entity group. The larger the group, the higher the risk of contention. Using entities that are not related by ancestry (with or without namespaces) effectively creates entity groups of size one, which keeps this type of contention to a minimum.
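The entity group difference between the two layouts in the question can be shown with a toy model. This is a sketch, not the real client library: `Key` here is a hypothetical stand-in for the `cloud.google.com/go/datastore` key type, the numeric IDs are made up to stand for IDs assigned on write, and the model assumes an entity group is identified by the root key of the ancestor chain.

```go
package main

import "fmt"

// Key is a simplified, hypothetical stand-in for a Datastore key.
type Key struct {
	Kind      string
	Name      string
	ID        int64 // stands for the ID assigned when an incomplete key is saved
	Namespace string
	Parent    *Key
}

// groupID walks up the Parent chain and returns a string identifying the
// root key, which in this model identifies the entity group.
func groupID(k *Key) string {
	root := k
	for root.Parent != nil {
		root = root.Parent
	}
	return fmt.Sprintf("%s|%s|%s|%d", root.Namespace, root.Kind, root.Name, root.ID)
}

// countGroups returns how many distinct entity groups the keys fall into.
func countGroups(keys []*Key) int {
	seen := make(map[string]bool)
	for _, k := range keys {
		seen[groupID(k)] = true
	}
	return len(seen)
}

func main() {
	// Layout 1: sessions are children of a user key.
	user := &Key{Kind: "user", Name: "alice"}
	withAncestor := []*Key{
		{Kind: "session", ID: 1, Parent: user},
		{Kind: "session", ID: 2, Parent: user},
	}
	// Layout 2: sessions are root entities in a per-user namespace.
	withNamespace := []*Key{
		{Kind: "session", ID: 1, Namespace: "alice"},
		{Kind: "session", ID: 2, Namespace: "alice"},
	}
	fmt.Println("ancestor layout groups:", countGroups(withAncestor))
	fmt.Println("namespace layout groups:", countGroups(withNamespace))
}
```

In this model every session under the same user lands in one group and shares its write limit, while each namespaced session is its own group.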

So unless you really, really need the ancestry, I'd suggest avoiding it if your expected usage patterns leave room for contention.

Side note: that article only touches on write contention, but you should be aware that contention can occur on reads as well (in transactions), see Contention problems in Google App Engine. The entity group size matters in this case too, because a transaction attempts to lock the entire entity group.

Is it good practice to put all user data into the same namespace? Or are namespaces meant for organizing bigger data? — John Balvin Arias, Jun 27 '18 at 03:31
Namespaces are designed primarily for multi-tenancy applications - to partition the data. Note that if you're using them you have to be consistent in their use. For example if you make queries they are contained to one namespace, see https://stackoverflow.com/questions/50071038/are-datastore-indexes-same-across-multiple-namespaces/50078730#50078730 — Dan Cornilescu, Jun 27 '18 at 03:36
Does multi-tenancy mean "multi-rent"? I don't understand the meaning; I translated it to my native language but it does not make sense to me — John Balvin Arias, Jun 27 '18 at 04:13
It is similar to that: your app would be serving your customers' services on their behalf and you want their data partitioned separately. See https://cloud.google.com/appengine/docs/standard/python/multitenancy/multitenancy Just like Google serves all GAE apps, without the apps interacting with each other. — Dan Cornilescu, Jun 27 '18 at 13:07
