20

The whole reason DynamoDB is fast and scalable is that it is eventually consistent. At the same time, it offers a ConsistentRead option for operations like get, batchGet, and query, which lets you make sure the data you are reading is the latest.

My question is about the update operation. First of all, it does not have the ConsistentRead option (one reason would be, update is not a read!). But at the same time, you can update a record in an atomic manner with ConditionExpression, like this:

// assuming: const AWS = require('aws-sdk');
//           const docClient = new AWS.DynamoDB.DocumentClient();
await docClient.update({
    TableName: 'SomeTable',
    Key: {id},
    UpdateExpression: "set #status = :new_status",
    ConditionExpression: '#status = :old_status',
    ExpressionAttributeNames: {
        "#status": "status",
    },
    ExpressionAttributeValues: {
        ":old_status": "available",
        ":new_status": "done",
    },
}).promise()

This will make sure that, at the time of the update, the old value of status is "available"; if it isn't, the operation will fail with an exception (a ConditionalCheckFailedException). So, in a sense, you can say that update is strongly consistent.
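To make the failure mode concrete, here is a sketch (reusing the table and attribute names from the snippet above, with docClient assumed to be an AWS SDK v2 DocumentClient) that tells a failed condition apart from other errors:

```javascript
// Returns true if the item existed with status "available" and was updated,
// false if the condition failed (item missing or status different),
// and rethrows anything else (throttling, network errors, ...).
async function markDone(docClient, id) {
  try {
    await docClient.update({
      TableName: 'SomeTable',
      Key: { id },
      UpdateExpression: 'set #status = :new_status',
      ConditionExpression: '#status = :old_status',
      ExpressionAttributeNames: { '#status': 'status' },
      ExpressionAttributeValues: {
        ':old_status': 'available',
        ':new_status': 'done',
      },
    }).promise();
    return true;
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') {
      return false; // the condition (including "the item exists") did not hold
    }
    throw err;
  }
}
```

Note that a missing item and a wrong status value both surface as the same ConditionalCheckFailedException, so the caller cannot distinguish them without a follow-up read.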

But my question is about a scenario in which you need to make sure the record exists. Let's say you have one function which inserts a record, and another one that updates the same record (given its id). My concern is: what if, by the time the update operation is executed, because of DynamoDB's eventual consistency, no record is matched and the update fails? As said before, the update operation does not come with a ConsistentRead option to make it strongly consistent.

Is this a valid concern? Is there anything I can do to help this?

BinaryButterfly
Mehran

3 Answers

5

There are no strongly consistent updates; strong consistency applies to reads, where data viewed immediately after a write is guaranteed to be consistent for all observers of the entity.

When your application writes data to a DynamoDB table and receives an HTTP 200 response (OK), the write has occurred (in at least one storage location) and is durable. The data becomes consistent across all storage locations, usually within one second or less. You can then choose to read this data in an eventually consistent or strongly consistent fashion.

Concurrent writes to the same item should be handled with optimistic concurrency; you can do conditional writes using the DynamoDB Transaction Library (available in the AWS SDK for Java).

If you need to update more than one item atomically, you can use DynamoDB transactions.

DynamoDB transactions provide developers atomicity, consistency, isolation, and durability (ACID) across one or more tables within a single AWS account and region. You can use transactions when building applications that require coordinated inserts, deletes, or updates to multiple items as part of a single logical business operation.

https://aws.amazon.com/blogs/aws/new-amazon-dynamodb-transactions/
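As an illustration of the transactions mentioned above, here is a sketch of an all-or-nothing update of two items using the AWS SDK v2 DocumentClient's transactWrite (the Accounts table, balance attribute, and transfer amount are illustrative, not from the question):

```javascript
// Builds parameters for a transactional "transfer": debit one item,
// credit another. Either both updates succeed or neither does.
function buildTransferParams(fromId, toId) {
  return {
    TransactItems: [
      {
        Update: {
          TableName: 'Accounts',
          Key: { id: fromId },
          UpdateExpression: 'set balance = balance - :amt',
          // The whole transaction fails if the debit would go negative.
          ConditionExpression: 'balance >= :amt',
          ExpressionAttributeValues: { ':amt': 10 },
        },
      },
      {
        Update: {
          TableName: 'Accounts',
          Key: { id: toId },
          UpdateExpression: 'set balance = balance + :amt',
          ExpressionAttributeValues: { ':amt': 10 },
        },
      },
    ],
  };
}

// Usage (assuming a configured DocumentClient):
// await docClient.transactWrite(buildTransferParams('a', 'b')).promise();
```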

CRUD operations are atomic; however, the official documentation says nothing about them being isolated (outside of DynamoDB transactions). In principle, race conditions can occur, and conditional updates can return an error.

See more here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithItems.html

Alternatively, your use case may benefit from DynamoDB global tables, which use "last writer wins" reconciliation between concurrent writes.

Jorge Garcia
    I believe that this answer is incorrect - that reads done as part of an UpdateItem are in fact strongly-consistent, and DynamoDB Transactions aren't needed just to achieve that (you may need them for other reasons, such as linearizing writes to multiple items, not just one). I wrote an answer to justify why I think this answer is wrong - although the official documentation is unfortunately lacking in this area, I think it contains enough hints and partial answers. – Nadav Har'El Apr 30 '22 at 20:08
  • @NadavHar'El, the answer is correct. Notice the question is specifically concerned about two different operations, one inserting a record (that will eventually be consistent), and another one updating the same record. The reads-before-write and strong consistency semantics apply only within the scope of the UpdateItem operation in relation to the record data available at the time of the update. It makes no assumptions about external writes to the same record. I hope this helps. – Jorge Garcia May 04 '22 at 18:13
  • I still believe the guarantees are stronger than you are making them out to be: Imagine you have an item with a=1 and have two concurrent writes: one unconditional write that writes "a=2" and another conditional write that if a=1, set "a=3". I claim that the two writes are serialized in some order and you always end up with a=2. You seem to claim that they are not serialized, and you can end up with the conditional update seeing the older a=1, wanting to set a=3 but the a=2 happens first and you end up with a=3. – Nadav Har'El May 07 '22 at 07:30
  • In other words, I believe that DynamoDB *serializes* writes, and in whichever order they end up happening, reads that are part of a write can see all previous writes. This means that these reads are "fully consistent" in the same sense as read requests. – Nadav Har'El May 07 '22 at 07:31
  • @NadavHar'El documentation says that CRUD operations are atomic, but it says nothing about them being serialized. In principle, race conditions do not affect unconditional updates. Dynamo performs an "upsert", overwrite when found, and insert when not found. However, conditional updates can potentially create race conditions if update operations are not serialized, and I could not find documentation that supports this serialization. In fact, conditional updates can fail if the condition is not met (due to race condition or otherwise), and it's up to the caller to resolve the conflict. – Jorge Garcia May 09 '22 at 18:58
  • I can't argue with you on what is indeed missing in the documentation - although Amazon's video talks do provide some implementation details missing from the documentation. But more importantly, please consider Amazon's published examples of using conditional updates for **optimistic locking** and similar concurrent update algorithms. If conditional updates were not isolated (i.e., serialized), they would have been completely worthless during concurrent updates. But they are not. They are serialized. And since DynamoDB implements writes on a single "leader" node, it's easy for them to do. – Nadav Har'El May 09 '22 at 20:34
  • Check out the second illustration here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithItems.html#WorkingWithItems.ConditionalUpdate It demonstrates how **concurrent** conditional writes are in fact isolated. As you noted, Amazon don't clearly document that conditional updates are isolated (serialized). But their examples suggest they believe this to be the case, otherwise, they will be unusable for the purpose they recommend it... – Nadav Har'El May 09 '22 at 20:54
  • There is no documentation supporting the claim that conditional updates are isolated from one another, and not subject to a race condition. However, the documentation clearly says that conditional updates can fail, suggesting that race conditions (among other scenarios) are a possibility that should be taken into account. I will update my answer to clarify this scenario. – Jorge Garcia May 12 '22 at 21:25
  • Jorge, as I said the written documentation is sadly lacking (much of what I inferred I inferred from youtube videos by Amazon developers), but I don't understand how you think Conditional Updates can do what they are claimed to be able to do - and first and foremost to implement "optimistic locking" - if they weren't isolated from each other. How would the optimistic locking pattern even work if conditional writes weren't guaranteed isolated? – Nadav Har'El May 13 '22 at 09:01
  • That's precisely the point of optimistic locking, transactions can complete without acquiring locks, or waiting for other transactions' locks to clear, at the expense of having to be rolled back and retried if a conflict is found on commit. Isolation seem to be only guaranteed by DynamoDB transactions. – Jorge Garcia May 16 '22 at 04:03
1

This question has been asked here many times in the past, see for example

Are dynamodb update expressions strongly consistent?

Are DynamoDB conditional writes strongly consistent?

The accepted answer above more-or-less suggests that, for reads that happen as part of writes, all bets are off and you can't trust them to be consistent - so you need to use the new "DynamoDB transactions" feature. But I believe that the conclusion in that answer is wrong. The new "DynamoDB transactions" are needed when you have a transaction which needs to isolate writes to several different items, and to safely support failed non-idempotent writes. But for the single-item updates supported by UpdateItem, I believe the reads-before-write involved are in fact strongly consistent:

Unfortunately, DynamoDB's documentation isn't completely clear about this, but here is some evidence from DynamoDB's documentation to back up my belief:

  1. There are several reasons why an UpdateItem might need to read the old value of the item (a so-called read-before-write) - and the question is about the consistency of these reads. One reason to read is an UpdateExpression (e.g., an update can increment an attribute), another is a ConditionExpression (an update can be conditional on an old value of an attribute), and a third case is ReturnValues - the user asking to get the old value of the item. Even though the UpdateItem documentation is not clear about the consistency in the first two cases, it is very explicit about the consistency in the third case: "The values returned are strongly consistent." I don't see any reason why the read-before-write would be strongly consistent in this case but not the others, so I believe it is strongly consistent in all three cases.
  2. Several presentations of DynamoDB internals by its developers explain how it works: DynamoDB keeps three replicas of each piece of data, and at any time one of the three is considered the "leader". Writes always go to the leader first and succeed when the leader and a second replica have persisted them; eventually consistent reads go to one of the three replicas at random (and so may read from a replica that hasn't received a recent write yet), but strongly consistent reads always go to the leader. This means that strongly consistent reads and writes go to the same replica (the leader) and have the same consistency guarantees. Moreover, the leader linearizes the writes to an item (i.e., performs them in some order, not concurrently), so the read-before-write of one write can see everything that was written by previous writes to the same item.
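The optimistic-locking pattern discussed in the comments above depends on exactly this serialization of conditional writes to one item. A sketch (AWS SDK v2 DocumentClient; the version attribute and names are illustrative):

```javascript
// Each writer reads the item, computes a new value, and writes it back
// conditioned on the version it originally read. If a concurrent writer
// got there first and bumped the version, this condition fails and the
// loser must re-read and retry.
function buildVersionedUpdate(id, seenVersion, newStatus) {
  return {
    TableName: 'SomeTable',
    Key: { id },
    UpdateExpression: 'set #status = :s, version = :next',
    ConditionExpression: 'version = :seen',
    ExpressionAttributeNames: { '#status': 'status' },
    ExpressionAttributeValues: {
      ':s': newStatus,
      ':next': seenVersion + 1,
      ':seen': seenVersion,
    },
  };
}

// If two writers both read version 3, only one of their conditional
// updates can succeed; the other gets ConditionalCheckFailedException:
// await docClient.update(buildVersionedUpdate(id, 3, 'done')).promise();
```

This only provides a correct lock if the two conditional writes are serialized on the leader replica, which is the point being argued in this answer.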
Nadav Har'El
  • The answer is correct. Notice the question is specifically concerned about two different operations, one inserting a record (that will eventually be consistent), and another one updating the same record. The reads-before-write and strong consistency semantics apply only within the scope of the UpdateItem operation in relation to the record data available at the time of the update. It makes no assumptions about external writes to the same record. I hope this helps. – Jorge Garcia May 04 '22 at 18:14
  • Update (which I think strengthens my claim that my answer is the correct one): Amazon just published, 10 years after the fact, a paper on DynamoDB: https://www.usenix.org/system/files/atc22-vig.pdf. One of the things it says: "Only the leader replica can serve write and strongly consistent read requests.". This is why write operations (such as UpdateItem) and strongly-consistent read requests can be serialized on this single node. – Nadav Har'El Jul 13 '22 at 07:13
-1

All of the answers above are incorrect. There is no write strong consistency.

UpdateItem returns once the leader node of the replication group has been acknowledged by a quorum of peers that have persisted the log record to their local write-ahead logs. This means that at that point ONLY the write-ahead log has been replicated, not the actual storage. Therefore UpdateItem doesn't guarantee that the write has been applied to all storage nodes in the replication group.

stanleywxc
    Downvoted for being a low-effort response. If you're going to challenge all of the other responses as being wholly incorrect, then back it up with more than a few sentences. – erstaples Apr 06 '23 at 18:06