
We are designing an application that will use DynamoDB as its storage system.

We identified the different access patterns and, after reviewing the Global Secondary Indexes documentation, got stuck deciding which approach to use: index overloading or two sparse indexes.

To give more context, our application stores orders, which can be internal or external. Depending on the type, each order is linked to either a Customer or a Warehouse:

Data model

As we would like to search by customer and/or warehouse, we thought about 2 solutions.

The first solution would be to keep the above data structure and create 2 indexes on:

  • GSI1 - Customer (PK)
  • GSI2 - Warehouse (PK)
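
A minimal sketch of this first solution, written as the keyword arguments one could pass to boto3's `create_table`. The table name `Orders`, the key `OrderId`, the on-demand billing mode and the sample orders are assumptions made for illustration; only `Customer`, `Warehouse`, `GSI1` and `GSI2` come from the description above. The helper `items_in_index` is a hypothetical in-memory model of why these indexes are sparse: an item that lacks the index key attribute is simply not written to that index.

```python
def sparse_gsi_table_definition():
    """Return keyword arguments shaped for boto3's client.create_table()."""
    def gsi(name, attribute):
        # One GSI keyed only on the given attribute (partition key).
        return {
            "IndexName": name,
            "KeySchema": [{"AttributeName": attribute, "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},
        }

    return {
        "TableName": "Orders",             # assumed name
        "BillingMode": "PAY_PER_REQUEST",  # assumed; avoids capacity settings here
        "AttributeDefinitions": [
            {"AttributeName": "OrderId", "AttributeType": "S"},  # assumed key
            {"AttributeName": "Customer", "AttributeType": "S"},
            {"AttributeName": "Warehouse", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "OrderId", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            gsi("GSI1", "Customer"),
            gsi("GSI2", "Warehouse"),
        ],
    }


def items_in_index(items, key_attribute):
    """Model sparseness: only items carrying the key attribute are indexed."""
    return [item for item in items if key_attribute in item]


# Made-up sample data: each order has either a Customer or a Warehouse.
orders = [
    {"OrderId": "o-1", "Customer": "c-42"},
    {"OrderId": "o-2", "Warehouse": "w-7"},
    {"OrderId": "o-3", "Customer": "c-42"},
]

customer_index = items_in_index(orders, "Customer")
warehouse_index = items_in_index(orders, "Warehouse")
```

With this sample data, `customer_index` holds the two customer orders and `warehouse_index` holds the single warehouse order, so each index stores only its own subset of the table.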

The second solution is to overload another column, like this:

Index overloading

So only 1 index is required, on Destination (PK), and queries are applied with a prefix.
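
A rough sketch of the overloaded variant. The prefix format (`CUSTOMER#` / `WAREHOUSE#`), the table name `Orders`, the key `OrderId` and the index name `GSI1` are assumptions, since the exact layout is only shown in the image above. Note that a DynamoDB `Query` needs an equality condition on the partition key, so the prefix is part of the full key value rather than something matched with `begins_with`. The parameters are built in the shape expected by boto3's low-level `client.query`, without actually calling AWS.

```python
def destination_key(kind, identifier):
    """Build the overloaded key value, e.g. kind='CUSTOMER', identifier='c-42'."""
    return f"{kind}#{identifier}"


def overload(order):
    """Copy an order and add the single overloaded Destination attribute."""
    item = dict(order)
    if "Customer" in order:
        item["Destination"] = destination_key("CUSTOMER", order["Customer"])
    elif "Warehouse" in order:
        item["Destination"] = destination_key("WAREHOUSE", order["Warehouse"])
    return item


def query_params(kind, identifier):
    """Keyword arguments shaped for boto3's client.query() on the single GSI."""
    return {
        "TableName": "Orders",  # assumed name
        "IndexName": "GSI1",    # assumed name
        # Partition keys only support equality in a key condition.
        "KeyConditionExpression": "Destination = :d",
        "ExpressionAttributeValues": {
            ":d": {"S": destination_key(kind, identifier)},
        },
    }


customer_order = overload({"OrderId": "o-1", "Customer": "c-42"})
warehouse_order = overload({"OrderId": "o-2", "Warehouse": "w-7"})
params = query_params("CUSTOMER", "c-42")
```

Both order types now share one attribute and one index; the caller chooses the access pattern by choosing the prefix.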

The question is: "Is there any benefit to index overloading over having 2 different sparse Global Secondary Indexes?" (Cost savings on capacity provisioning, data transport, query times, data complexity...)

Alberto Martin

1 Answer


As I haven't received any answer yet, I'll add my own opinion.

There is no big difference between the 2 approaches: in both cases, all items end up being indexed and similar attributes are stored.


Some benefits I could find are:

Benefits using 2 GSI

  1. Data schema is easier to understand (no overloading)
  2. More flexibility for evolving the schema: if requirements change, an order can be assigned to both a customer and a warehouse.
  3. Ability to tailor projections per index (may not always be applicable, but you may only need 2 fields for the Customer access pattern and 3 for the Warehouse one)
  4. Smaller indexes may perform slightly better, although DynamoDB query latency generally depends little on index size

Benefits using 1 GSI

  1. No need to worry about splitting capacity units: they can be similar to the main table's. With 2 indexes, you need an estimate of how many records will fall under each of them; otherwise you have to over-provision them.

    Example: if you give each index 50% of the main table's RCU and WCU, but 70% of your orders are for customers, some requests will be throttled.
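
The arithmetic behind that example can be sketched as follows. This is a deliberately simplified model: it assumes the main table is written at its full provisioned rate, that every order costs the same to write into an index, and it ignores burst capacity. The figure of 100 WCU and the function name are made up for illustration.

```python
def index_write_shortfall(table_wcu, index_pct, traffic_pct):
    """Capacity units the index is missing under this simplified model.

    table_wcu   -- write capacity provisioned on the main table
    index_pct   -- percent of table_wcu provisioned on the index
    traffic_pct -- percent of writes that land in the index
    """
    provisioned = table_wcu * index_pct / 100
    demand = table_wcu * traffic_pct / 100
    return max(0.0, demand - provisioned)


# Main table at 100 WCU (assumed), each index given 50% of it.
customer_shortfall = index_write_shortfall(100, 50, 70)   # 70% of orders
warehouse_shortfall = index_write_shortfall(100, 50, 30)  # remaining 30%
```

Under these assumptions the customer index is 20 units short while the warehouse index has capacity to spare, which is the mismatch that a single overloaded index avoids.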

In summary, although using 2 indexes allows a more precise configuration, it may end up costing more, and the index configuration needs to be reviewed from time to time to match how the access patterns are actually used.

Alberto Martin
  • It's been over a year since your answer and I am in a similar situation. Could you share your thoughts: what did you end up with? Any issues or feedback you can share with others? Thanks – demsey May 25 '21 at 21:21
  • @demsey sorry for the late reply. In the end we decided to use a single GSI. The main reason is simplicity: only 1 configuration, only 1 place to monitor. The decision could be different depending on the use case, but I'd only use multiple indexes when there is a strong reason for it, such as security, performance, criticality... – Alberto Martin Jul 16 '21 at 06:41