0

I have searched for this and haven't found an answer yet, so I am asking here.

According to the Google Cloud Datastore documentation:

There is a write throughput limit of about one transaction per second within a single entity group.

Now let's say I have an entity kind User and another kind Car, and they share a common parent. So the parent together with all of its User and Car children forms one entity group, right?
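To make my mental model concrete, here is a toy sketch of how I understand key ancestry. This is plain Python I wrote for illustration, not the actual Datastore client API: I'm modeling a key as a path of (kind, id) pairs, with the entity group identified by the path's root element.

```python
# Toy model (NOT the Datastore client API): a key is a path of
# (kind, id) pairs from the root ancestor down to the entity itself.
# The entity group is identified by the root element of that path.

def entity_group_root(key_path):
    """Return the root (kind, id) pair that identifies the entity group."""
    return key_path[0]

# A User and a Car sharing the ancestor ("Parent", 1) are in ONE group:
user_key = [("Parent", 1), ("User", 42)]
car_key = [("Parent", 1), ("Car", 7)]
assert entity_group_root(user_key) == entity_group_root(car_key)

# A User under a different parent is in a SEPARATE group, so it can be
# written concurrently without contending with the first group:
other_user_key = [("Parent", 2), ("User", 43)]
assert entity_group_root(other_user_key) != entity_group_root(user_key)
```

If this model is right, then the write limit applies per root ancestor, which is exactly what my question below is about.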

Let's assume that in the Datastore, User and Car have ten million instances/rows each.

Suppose I fire a transactional query to update one instance/row in the Datastore.

My confusion is: how many entity instances get locked when the write limit is applied?

A. All User + Car instances (all twenty million of them)?

B. Just one instance each of User and Car (one User row and one Car row)?

In database parlance, User is an entity kind/table. So does the entire kind/table get locked for one write operation, or does just the one instance/row being written get locked?

If A is the case, does that mean that for one write all 20 million rows of User + Car entities will be locked? That seems crazy. What if I have to update all 20 million rows? If each write operation updates just one row, will 20 million rows require 20 million seconds to avoid contention?

underdog
  • 4,447
  • 9
  • 44
  • 89
  • Well, it's up to you to choose an entity ancestry structure that makes sense for your app. Also - some batching is possible, see https://stackoverflow.com/questions/38277246/datastore-multiple-writes-against-an-entity-group-inside-a-transaction-exceeds/38277520#38277520 – Dan Cornilescu Jul 17 '17 at 23:54
  • @DanCornilescu I understand that, Dan. I just want to know how many instances get locked. I've edited my question. Do all 20 million entity instances get locked for one write operation, or just the one instance that is being updated? – underdog Jul 18 '17 at 05:37
  • The entire group is "locked". The question you have to ask yourself is: is such an ancestry really, really needed? Or is it just very convenient? I was a bit surprised at the beginning as well, but after tweaking my ancestry structure a bit and struggling with data contention, I realized these "limitations" (there are others as well) are really drivers for highly scalable designs... – Dan Cornilescu Jul 18 '17 at 11:57

2 Answers

1

An entity group is a set of entities connected through ancestry to a common root element. The organization of data into entity groups can limit what transactions can be performed.

See the Python docs here. I'm surprised it wasn't somewhere in your Java documentation link.

JGFMK
  • 8,425
  • 4
  • 58
  • 92
  • I understand that. I just want to know how many instances get locked. I've edited my question. Do all 20 million entity instances get locked for one write operation, or just the one instance that is being updated? – underdog Jul 18 '17 at 05:39
1

Finally found the answer in this Datastore article:

[Image: entity group diagram from the article, showing an organization root entity with its people as child entities]

In the example above, each organization may need to update the record of any person in the organization. Consider a scenario where there are 1,000 people in the “ateam” and each person may have one update per second on any of the properties. As a result, there may be up to 1,000 updates per second in the entity group, a result which would not be achievable because of the update limit. This illustrates that it is important to choose an appropriate entity group design that considers performance requirements. This is one of the challenges of finding the optimal balance between eventual consistency and strong consistency.

underdog
  • 4,447
  • 9
  • 44
  • 89