2

I have been reading the DynamoDB documentation, and it is not clear to me what hash keys and range keys are or how they should be used.

I just need a basic explanation of what they are and how I am supposed to use them so that I can move forward with using DynamoDB.

John Rotenstein
Sagar Acharya

3 Answers

9

I like to think of it like this:

  • Every item (row) in a table needs to have a Unique ID (Primary Key)
  • A Primary Key is either:
    • Partition Key, or
    • Partition Key + Sort Key

For example, if you had an Invoices table, then the primary key would be the Invoice Number. If you had a Login table, then the primary key would be User ID + Timestamp, because one user could have multiple logins.

Behind the scenes, the Partition Key is also used to distribute data amongst servers. This is how DynamoDB keeps its speed high: when there's more data, it is distributed across more servers.

If a table (such as the Login table) has multiple entries for a given Partition Key (e.g. User ID), then the addition of the Sort Key ensures the uniqueness of the Primary Key so that the Item can be stored and retrieved quickly.
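To make the distribution idea concrete, here is a minimal sketch. DynamoDB's real hash function and partition layout are internal, so the MD5-and-modulo scheme, the function name `pick_partition`, and the user IDs below are all assumptions chosen for illustration.

```python
import hashlib


def pick_partition(partition_key: str, num_partitions: int) -> int:
    """Map a partition key value to one of num_partitions buckets.

    Illustrative only: DynamoDB does not document its internal hash function.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it into the bucket range.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical user IDs; the same key always maps to the same bucket.
for user_id in ["alice", "bob", "carol"]:
    print(user_id, "->", pick_partition(user_id, 4))
```

The point is only that the key value, not the order of insertion, decides where an item lives, so all items sharing a Partition Key stay together.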

Hash Key = Partition Key

Range Key = Sort Key
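The older names still show up in the API: a table's `KeySchema` marks the partition key as `HASH` and the sort key as `RANGE`. The sketch below only builds the argument dictionary; the helper name `table_definition`, the table and attribute names, the string (`"S"`) attribute type, and on-demand billing are assumptions for the two example tables above.

```python
def table_definition(table_name, partition_key, sort_key=None):
    """Build keyword arguments in the shape accepted by DynamoDB's CreateTable."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    # Assumption: every key attribute is a string ("S").
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",
    }


# Partition key only: one item per invoice number.
invoices = table_definition("Invoices", "InvoiceNumber")
# Partition key + sort key: many logins per user, one per timestamp.
logins = table_definition("Logins", "UserId", "Timestamp")
```

With boto3 installed and credentials configured, something like `boto3.client("dynamodb").create_table(**logins)` should create the table.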

John Rotenstein
  • So can we have two partition keys in the same table, given that DynamoDB has no separate unique constraint? Can we add the column we want to be unique as a partition key? – Sagar Acharya Aug 09 '19 at 00:56
  • No, each table has a single Partition Key and an optional Sort Key. The unique constraint in DynamoDB is enforced through these keys. – John Rotenstein Aug 09 '19 at 01:00
  • Let's say I have an employees table with id as the partition key, and in the same table I have an email field which has to be unique. Can I enforce a unique constraint on the email if I already have id as the partition key? – Sagar Acharya Aug 09 '19 at 01:03
  • So is normalization the only option I have? – Sagar Acharya Aug 09 '19 at 01:20
  • 1
    The only unique constraint is the Primary Key (which is either just Partition Key, or both Partition Key and Sort Key). You cannot force any other field to be unique. See also: [Is there a way to enforce unique constraint on a property (field) other than the primary key in dynamodb - Stack Overflow](https://stackoverflow.com/questions/12920884/is-there-a-way-to-enforce-unique-constraint-on-a-property-field-other-than-the) – John Rotenstein Aug 09 '19 at 03:40
2

From DynamoDB Documentation

The hash key is your partition key (similar to a primary key in SQL). In a table that has only a partition key, no two items can have the same partition key value. This holds only when there is no sort key.

The range key is your sort key. Together, the hash key and range key form a composite primary key: the first attribute is the partition key, and the second attribute is the sort key.

E.g.:

H1 + R1 -> H1R1 is one Composite Key.

H1 + R2 -> H1R2 is another Composite Key.

For a real-world scenario, consider a user who has multiple roles.

The User ID alone can't identify each item, because it repeats for every role. User ID + Role ID makes a unique composite key.
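A toy in-memory model of this user/role case is sketched below. `TinyTable` is a hypothetical stand-in and not part of any AWS SDK, and the user and role values are made up; it only mirrors the rule that the (hash key, range key) pair identifies an item, and that writing the same pair again replaces the item.

```python
class TinyTable:
    """Toy in-memory stand-in for a table with a composite primary key."""

    def __init__(self):
        self._items = {}

    def put(self, hash_key, range_key, item):
        # The same (hash, range) pair overwrites; a new pair adds a new item.
        self._items[(hash_key, range_key)] = item

    def count(self):
        return len(self._items)


roles = TinyTable()
roles.put("user-1", "admin", {"granted": "2019-08-01"})
roles.put("user-1", "editor", {"granted": "2019-08-02"})
roles.put("user-1", "admin", {"granted": "2019-08-09"})  # replaces the first item
print(roles.count())  # prints 2
```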

VKB
  • So can we have two partition keys in the same table, given that DynamoDB has no separate unique constraint? Can we add the column we want to be unique as a partition key? – Sagar Acharya Aug 09 '19 at 00:55
  • You can have either one partition key alone or one partition key + sort key combination. That way you know exactly how to keep data unique. – VKB Aug 09 '19 at 20:26
1

The hash key and range key (a.k.a. sort key) together form the key for each of the items in your database. But what's the difference between the two key parts?

A "hash key" is mandatory. DynamoDB is a distributed database, and it uses the hash key to decide which of the cluster's node(s) stores each item. In particular, all items with the same hash key end up on the same node.

They don't just end up on the same node. They are actually stored there contiguously on disk, sorted in the order of the second part of the item's key. That is why this part is called the sort key, or the range key, because it can be used to read a range of items between two values of the range key.

Having both parts of the key gives you powerful ways to model your data so that it can be retrieved efficiently. It is efficient to retrieve a specific item with a specific key (the GetItem operation), or all items that have a specific hash key but a range of range keys (the Query operation). There are a lot of examples in the DynamoDB documentation of how to use both parts of the key.
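The two access patterns can be sketched with a small in-memory model. This is a simplification and not how DynamoDB is implemented; the class `SortedPartitions`, its method names, and the sample login data are hypothetical. Each hash key owns a list kept sorted by range key, so a range read is a contiguous slice.

```python
import bisect
from collections import defaultdict


class SortedPartitions:
    """Toy model: each hash key owns a list of (range_key, item) sorted by range key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def _locate(self, hash_key, range_key):
        rows = self._partitions.get(hash_key, [])
        keys = [rk for rk, _ in rows]
        return rows, bisect.bisect_left(keys, range_key)

    def put_item(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        i = bisect.bisect_left([rk for rk, _ in rows], range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)  # same full key: replace
        else:
            rows.insert(i, (range_key, item))  # keep the list sorted

    def get_item(self, hash_key, range_key):
        """Fetch the single item with this exact (hash, range) key, or None."""
        rows, i = self._locate(hash_key, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            return rows[i][1]
        return None

    def query_between(self, hash_key, low, high):
        """Return items for one hash key whose range key is in [low, high]."""
        rows = self._partitions.get(hash_key, [])
        keys = [rk for rk, _ in rows]
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [item for _, item in rows[start:end]]


logins = SortedPartitions()
logins.put_item("user-1", "2019-08-09T00:56", "login A")
logins.put_item("user-1", "2019-08-07T10:00", "login B")
logins.put_item("user-1", "2019-08-08T12:30", "login C")
logins.put_item("user-2", "2019-08-08T09:00", "login D")

recent = logins.query_between("user-1", "2019-08-08T00:00", "2019-08-09T23:59")
print(recent)  # ['login C', 'login A']
```

ISO-style timestamp strings are used here because they sort lexicographically in time order, which is what makes the range read meaningful.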

Nadav Har'El