
The main Amazon QLDB page says

QLDB is also serverless, so it automatically scales to support the demands of your application.

However, even products like DynamoDB—with practically unbounded automatic scaling—have some scaling limits. (For example, DynamoDB has a max of 3k RCU per partition key.)

I’m trying to find out the scaling/performance limits of QLDB. Is there any max TPS or max throughput per key, table, ledger, or account? Is there a maximum storage size per table or ledger or account?

As of October 2019, there’s no mention of any scaling limits on the QLDB Quotas and Limits page.

The QLDB FAQ page says,

Amazon QLDB can execute 2 – 3X as many transactions than ledgers in common blockchain frameworks.

That’s a start, but it’s not very helpful because “2-3X” is a relatively wide range, and they haven’t specified which blockchain frameworks they consider common.

Has anyone found any info (in the documentation, in AWS blog posts, from a deep-dive session, etc.) on whether there are any such limits?

Matthew Pope

1 Answer


As you note, with any system there are limits. The only true answer to your question would require benchmarking your use case to see what numbers you get. I don't want to mislead you!

That said, I can help you understand some QLDB fundamentals which will help you build a mental model for how the system should behave for different workloads.

The first concept to understand is the document-revision model. In QLDB, documents are inserted, then updated (revised), and eventually deleted. Each document has a QLDB-assigned UUID, and each revision has a QLDB-assigned version number that is strictly monotonically increasing and dense. Documents are revised by issuing transactions (sending PartiQL statements) over a QLDB session.
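To make this concrete, here is a minimal sketch of the insert-then-revise lifecycle using the Python driver (pyqldb). The ledger name, table, and fields are hypothetical placeholders, and the table is assumed to already exist:

```python
# A minimal sketch of the document-revision lifecycle, assuming the
# Python pyqldb driver and an existing "People" table. The ledger name,
# table, and fields are hypothetical placeholders.
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="test-ledger")  # hypothetical ledger

# Version 0: insert a new document; QLDB assigns it a document UUID.
driver.execute_lambda(lambda txn: txn.execute_statement(
    "INSERT INTO People ?", {"name": "Mary", "city": "Seattle"}))

# Version 1: revising the document produces the next version number.
driver.execute_lambda(lambda txn: txn.execute_statement(
    "UPDATE People SET city = ? WHERE name = ?", "Portland", "Mary"))

# Every revision (with its version number in the metadata) stays
# queryable through the built-in history() function.
revisions = driver.execute_lambda(lambda txn: list(txn.execute_statement(
    "SELECT * FROM history(People)")))
```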

Next, transactions. Transactions typically read some state and then either continue or abandon. For example, if you are building a banking application with the use case of transferring money from Mary to Joe, the transaction may be "read the balance of Mary", "read the balance of Joe", "set the balance of Mary" and "set the balance of Joe". In between, your application can enforce constraints. For example, if it determines that Mary's balance is less than the transferred amount, it would abandon the transaction. If this transaction succeeds, two new revisions are created (one for Mary's updated account document and one for Joe's).
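For illustration, that transfer might look like the following with the Python driver. This is a sketch, not production code: the ledger, table, account ids, and amounts are hypothetical, and both accounts are assumed to exist:

```python
# A sketch of the Mary-to-Joe transfer, assuming the Python pyqldb
# driver and an existing "Accounts" table. Names and values are
# hypothetical.
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="bank-ledger")  # hypothetical ledger

def transfer(txn, amount):
    # Read both balances inside the transaction.
    mary = next(txn.execute_statement(
        "SELECT balance FROM Accounts WHERE accountId = ?", "mary"))
    joe = next(txn.execute_statement(
        "SELECT balance FROM Accounts WHERE accountId = ?", "joe"))
    # Enforce the application constraint; raising abandons the transaction.
    if mary["balance"] < amount:
        raise ValueError("insufficient funds")
    # On success, these two statements produce two new revisions.
    txn.execute_statement(
        "UPDATE Accounts SET balance = ? WHERE accountId = ?",
        mary["balance"] - amount, "mary")
    txn.execute_statement(
        "UPDATE Accounts SET balance = ? WHERE accountId = ?",
        joe["balance"] + amount, "joe")

driver.execute_lambda(lambda txn: transfer(txn, 25))
```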

The next concept is Optimistic Concurrency Control (OCC), which is explained at https://docs.aws.amazon.com/qldb/latest/developerguide/concurrency.html. When you attempt to commit a transaction, QLDB will reject it if another transaction interfered with the one you are attempting to commit. For example, if another withdrawal was made from Mary's account (after you read the balance), your commit will fail with an OCC conflict, allowing you to retry the transaction (and re-check that Mary still has enough money). Thus, the nature of your transactions will affect your performance. If you are reading account balances and then producing new balances based on what you read, you will have lower throughput than if you are creating new accounts or setting accounts to arbitrary amounts (neither of which requires any reads).
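The driver handles the retry loop for you: when a commit fails with an OCC conflict, execute_lambda re-invokes your transaction function (including the reads), up to a configurable limit. A brief sketch, assuming pyqldb's RetryConfig; the limit shown is illustrative:

```python
# Configuring OCC retries, assuming the pyqldb driver. On an OCC
# conflict the driver re-invokes the transaction function, so the
# balance is re-read before the transaction is re-attempted.
from pyqldb.config.retry_config import RetryConfig
from pyqldb.driver.qldb_driver import QldbDriver

retry_config = RetryConfig(retry_limit=3)  # illustrative limit
driver = QldbDriver(ledger_name="bank-ledger", retry_config=retry_config)
```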

The fourth concept is that of the Journal. QLDB is a "Journal first" database: all transactions are first written to a distributed log which is then used to update indexed storage. The QLDB architecture abstracts the physical log implementation for you but does expose the concept of a "strand", which is a partition of the Journal. Each strand has a fixed amount of capacity (new revisions per second). QLDB currently (late 2019) restricts each ledger to a single strand.

Putting this together, hopefully I can help you with your questions:

  1. Max TPS. The theoretical upper-bound is the max TPS of a single strand. There isn't a single fixed number, as various factors may influence it, but it is many thousands of TPS.
  2. Max TPS per document. This will never exceed the max TPS, but will be bound more by OCC than anything else. If you are simply inserting new documents (no reads) you will have zero OCC conflicts. If you are reading, you will be bound by the time it takes us to update our indexed storage from the Journal. 100 TPS is a good starting point.
  3. Max per table. There are no per-table limits, other than those imposed by other limits (i.e. the per-document limit or the strand limit).
  4. Max per account. We have no account-wide limits on the "QLDB Session" API. Each ledger is an island.
  5. Max size per table, ledger or account. There are no limits here.

A note on sessions: we have a default limit of 1500 sessions to QLDB. Each session can only have 1 active transaction, and each transaction takes some amount of time either due to PartiQL query time, network round-trips, or work your application is doing with results. This will impose an upper bound on your performance. We do allow customers to increase this limit, as described at https://docs.aws.amazon.com/qldb/latest/developerguide/limits.html.
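As a rough illustration of why sessions bound throughput, here is some back-of-envelope arithmetic and how you might size the driver's session pool. The max_concurrent_transactions parameter and all numbers below are assumptions for illustration, not guarantees:

```python
# Back-of-envelope session math (illustrative numbers only): if each
# transaction takes ~20 ms end to end (PartiQL execution, network
# round-trips, and application work), one session can commit at most
# ~50 TPS, so the default 1500 sessions give a session-side ceiling
# well above a single strand's capacity.
from pyqldb.driver.qldb_driver import QldbDriver

# max_concurrent_transactions caps the driver's session pool; the
# parameter and value here are assumptions for illustration.
driver = QldbDriver(
    ledger_name="test-ledger",
    max_concurrent_transactions=100,
)
```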

With regards to the other part of your question (documentation, examples and learning materials), I can provide some information. QLDB was released last month, so re:Invent 2019 is the first opportunity we have to engage with customers and gain direct feedback on where developers need more help. We gave a 300-level talk at re:Invent 2018 and will do another one this year. I will be giving a "Chalk Talk" on our Journal-first architecture and will cover some of these concepts. The session will be recorded and uploaded to YouTube, but the Chalk Talks require you to be there in person. Either way, this is just one of many opportunities we have to engage and better explain the QLDB architecture, benefits and limitations. Feel free to keep asking questions and we'll do our best to answer them and improve the quality of the documentation available.

In terms of the "2-3x" claim, this number was determined by building real-world use cases (such as the banking example) against blockchain frameworks and QLDB, and distilling those learnings into a single number. We believe the centralized nature of QLDB can provide many benefits if one doesn't need a distributed ledger, and performance is one of them. If you have specific use cases where QLDB is not faster than the same use case on a blockchain framework, we'd love to hear about those.

Marc
  • Hi, it's been a bit difficult to find information about this new QLDB. Perhaps you might be able to answer this question. Currently we are about to build a new service based on an event-sourced architecture. Do you think QLDB could be a good way to keep a permanent ledger or event log for the Kinesis stream that we pipe events through? I'm concerned about query performance, but the idea of an encrypted, permanent storage type appeals to the needs of an event-sourced system in my industry. Should I just go with standard DynamoDB instead, or does this sound reasonable? – user3379893 Nov 07 '19 at 15:27
  • It's going to be really hard to give you a concrete, useful answer without a detailed use case. But I'll do my best to give you my perspective, concisely. – Marc Nov 07 '19 at 18:33
  • Let's consider a system where you master your data in another database such as DynamoDB. You could use change streams to replicate changes into QLDB. Now, you get to leverage the "complete and verifiable" properties. For example, you could hand a change event out and later come back and ask QLDB to prove to you that the contents of the event are authentic (from when the data landed in QLDB). This is pretty straightforward and a low-risk way to get started with QLDB. – Marc Nov 07 '19 at 18:52
  • Mastering your data in QLDB lets you use transactions with PartiQL. Unlike normal SQL-based transactions, PartiQL has support for nested content (documents) which can significantly simplify how you interact with the database. QLDB currently (late 2019) has limited indexing functionality, but if you're familiar with proper use of DynamoDB, I think you'll be able to play within the restrictions. – Marc Nov 07 '19 at 19:00
  • In terms of getting data out of QLDB, you have a couple of options. First, you can do what you normally would do and poll the database for updates. This is simple, but you can miss changes. Next, you can use QLDB's history functionality to ensure you don't miss changes. Finally, you can use the Export functionality to dump the Journal into S3 and do whatever you want. I'm interested to hear from you if any of these meet your requirements and, if not, what would. QLDB is a "Journal first" database and this architecture makes it really easy for us to get data out. – Marc Nov 07 '19 at 19:03
  • Final comment: I'm sorry you are finding it difficult to find information about QLDB. In 3 days' time we'll be celebrating our 2-month birthday. We're hungry for customer feedback, both in terms of functionality you want and in terms of where our documentation and examples are lacking. Feel free to ask questions on the QLDB forums (https://forums.aws.amazon.com/forum.jspa?forumID=353), StackOverflow, or ping me on Twitter (@marcbowes). As I noted on another question, this year's re:Invent will have some sessions that will provide insight both into how QLDB is built and how best to use it. – Marc Nov 07 '19 at 19:07
  • Thank you very much for the responses Marc, I like your idea of replicating changes into QLDB as a starting point. My primary concern with Dynamo is with regards to the hot key/partition problem, and generally any time-based events. This is a really good write-up of the problem: https://github.com/alessandrobologna/dynamodb-event-store. – user3379893 Nov 11 '19 at 11:51
  • Totally get that it's a brand new product, I will actually be attending re:Invent this year. I will definitely change my itinerary a bit to attend some of the QLDB sessions, provided that they're not already filled up as most of the interesting sessions usually are =) – user3379893 Nov 11 '19 at 11:51
  • Once we get more insight into the scale of our problem in the coming weeks, I would love to reach out to you (probably on Twitter) and maybe have you take a look at any work we do with QLDB! – user3379893 Nov 11 '19 at 11:52
  • Sure, happy to help. Try to attend my Chalk Talk; there will be time for questions at the end. Monday @ 15:15. – Marc Nov 11 '19 at 19:58
  • @Marc : Can you explain what a session is? Is it true that QLDB can have only 1500 concurrent transactions at a time? Also does QLDB have support for cross table transactions? – dashuser Jan 13 '20 at 19:08
  • A session is conceptually like a login. Once you have a session, you can start transactions. Each session can only do 1 transaction at a time. A transaction can operate on any number of tables. Transactions execute concurrently (across sessions) and any conflicts are handled by optimistic concurrency control (OCC). The 1500 limit is something that can be changed if you request a limit increase. However, we find that 1500 is more than enough (by about 3x) to max out the throughput of the single Journal strand we offer today. If you need more info, please post a new top-level question :). – Marc Jan 13 '20 at 20:39
  • Has anyone seen a performance stress test? General sense of maximum transactions per second that QLDB may be able to offer. Marc? – VinLucero Feb 05 '22 at 07:14
  • "Many thousands" as in able to keep up with Visa at 25,000 TPS? – VinLucero Feb 05 '22 at 07:16
  • @Marc or someone else could you please check this question? https://stackoverflow.com/questions/73780825/lambda-random-long-execution-while-running-qldb-query – Thiago Scodeler Sep 21 '22 at 10:32