When we add more nodes to a DynamoDB DAX cluster, is the data distributed across the nodes so that the total cache capacity equals (number of nodes × node capacity)? Or do the additional nodes only provide availability and load distribution, with the cache capacity staying that of a single node?

John Rotenstein
Nilesh Soni

1 Answer

Here’s what the DAX documentation has to say:

A DAX cluster consists of one or more nodes. Each node runs its own instance of the DAX caching software. One of the nodes serves as the primary node for the cluster. Additional nodes (if present) serve as read replicas. For more information, see Nodes.

And then the Nodes link says:

You can scale your DAX cluster in one of two ways:

• By adding more nodes to the cluster. This will increase the overall read throughput of the cluster.

• By using a larger node type. Larger node types provide more capacity and can increase throughput. (Note that you must create a new cluster with the new node type.)

So, by adding more nodes, you add more read replicas and the ability to handle more requests per second on the same amount of data. Adding nodes does not increase the total cache size.

You can increase the amount of data in the DAX cache by using a larger instance type for your cluster, or by using multiple DAX clusters for the same tables.

Getting a larger cache through multiple DAX clusters is possible, but a little complicated. You would need to figure out how to partition your read requests to divide them consistently between your DAX cluster endpoints.
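One way to do that partitioning, as a minimal sketch: hash the item's partition key and use the result to choose an endpoint, so that the same key always goes to the same cluster. The endpoint strings and the `pick_endpoint` helper below are hypothetical names for illustration, not part of any AWS API, and creating the actual DAX client for the chosen endpoint is left out.

```python
import hashlib

# Hypothetical DAX cluster endpoints (placeholders, not real clusters).
DAX_ENDPOINTS = [
    "dax-cluster-a.example.internal:8111",
    "dax-cluster-b.example.internal:8111",
]


def pick_endpoint(partition_key: str, endpoints=DAX_ENDPOINTS) -> str:
    """Map a partition key to one endpoint, deterministically.

    A stable hash (SHA-256) is used instead of Python's built-in hash(),
    which is randomized per process for strings and would send the same
    key to different clusters from different application servers.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(endpoints)
    return endpoints[index]


# The same key is always routed to the same cluster.
endpoint = pick_endpoint("user#42")
```

Note that with simple modulo hashing like this, changing the number of clusters remaps most keys to a different cluster, so the caches would effectively start cold again. A consistent-hashing scheme would reduce that, at the cost of more code.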

Community
Matthew Pope
  • When a node is added, does it get the cache data from the other (primary) node? Any pointer to documentation regarding cache behavior when adding a node? – cad Sep 05 '19 at 15:47
  • "If there are any read replicas in the cluster, DAX automatically keeps the replicas in sync with the primary node." (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DAX.concepts.html#DAX.concepts.request-processing-read) Though not explicit, this implies that populating data to the replica must happen before the replica starts taking traffic since an unpopulated replica would be out of sync with the primary node. – Matthew Pope Sep 05 '19 at 16:56
  • Thanks Matthew. Does DAX support secondary indexes like LSI or GSI? Do you have any sample Python code that makes a DAX call against a GSI? – cad Sep 10 '19 at 16:19