
I am new to Cosmos DB and am exploring partition keys. I understand that a partition key helps with faster retrieval. In my case I have customer data with `custId`, `offerCode`, `offerId` and some other properties. I am planning to use `offerId` as the partition key. My question is: when fetching data, do I need to query by `offerId` for better performance, or can I fetch the data by another property in the collection? Does that affect performance? Below is the schema of my items:

{
  "custId": "abc12345",
  "offers": [
    {
      "offerId": "offer123",
      "offerCode": "offerCode1"
    },
    {
      "offerId": "offer123",
      "offerCode": "offerCode2"
    }
  ]
}
ppb
  • I would suggest taking a bit of time to read through the Cosmos DB docs when it comes to partition key, as the docs do a great job of going over this. But tl;dr you can query on anything you want, but if you don't specify partition key as well, you'd need to search multiple partitions to find the data you're looking for. – David Makogon Mar 01 '21 at 20:13
  • @DavidMakogon, thank you. I am going through the docs. If I fetch the data by `custId`, which is not the `partitionKey`, will it search multiple partitions? I am not using `custId` as the `partitionKey` because the data for each `custId` is always unique, and whenever an update happens for a `custId` I first delete the item and then create a new one. – ppb Mar 01 '21 at 20:49

1 Answer


David is a Cosmos DB expert, and as @David said, the concept you need to understand is the partition key. Here is some documentation.

See the official documentation, and this answer on Stack Overflow.

In my opinion, if your database won't contain much data (less than 50 GB; a physical partition can store up to 50 GB), all the logical partitions (items are grouped into logical partitions by partition key value) will sit in one physical partition, so queries won't cross physical partitions. In that case you could even use the item ID as the partition key, which keeps RU consumption evenly balanced.
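To make the 50 GB threshold concrete, here is a minimal sketch of the arithmetic. It only uses the storage limit quoted above; the function name is made up for illustration, and the estimate deliberately ignores provisioned throughput, which can also cause Cosmos DB to split data across more physical partitions.

```python
import math

# Storage limit per physical partition, as quoted in the answer above.
PHYSICAL_PARTITION_STORAGE_GB = 50


def min_physical_partitions(data_size_gb: float) -> int:
    """Lower bound on physical partitions needed for storage alone.

    Hypothetical helper: throughput-driven splits are ignored on purpose.
    """
    if data_size_gb < 0:
        raise ValueError("data_size_gb must be non-negative")
    return max(1, math.ceil(data_size_gb / PHYSICAL_PARTITION_STORAGE_GB))


for size_gb in (5, 50, 51, 400):
    print(f"{size_gb} GB -> at least {min_physical_partitions(size_gb)} physical partition(s)")
```

As long as the result is 1, every query stays inside a single physical partition regardless of which property it filters on.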

By the way, as far as I am concerned, the partition key plays the role of a 'group'. If you have a large database that really does span several physical partitions, fetching with the partition key lets Cosmos DB go straight to the right place, because all items of one logical partition are stored together in one physical partition. You should also know that a partition key cannot be changed in place: to change it, you need to move your data to a new container that uses the desired partition key.
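The routing idea can be illustrated with a toy model. This is not the Cosmos DB SDK or its real hashing scheme: `ToyContainer`, the SHA-256 based placement, and the flattened one-item-per-offer documents are all assumptions made for the sketch. It only shows why a query that supplies the partition key touches one physical partition, while a query on another property (such as `custId`) has to fan out to all of them.

```python
import hashlib
from collections import defaultdict


class ToyContainer:
    """Toy stand-in for a partitioned container (not the real Cosmos DB engine)."""

    def __init__(self, partition_key: str, physical_partitions: int):
        self.partition_key = partition_key
        # Each physical partition maps a partition key value (logical partition) to its items.
        self.partitions = [defaultdict(list) for _ in range(physical_partitions)]

    def _physical_index(self, key_value) -> int:
        # Assumed placement rule: hash the key value, then take it modulo the partition count.
        digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def upsert(self, item: dict) -> None:
        key_value = item[self.partition_key]
        self.partitions[self._physical_index(key_value)][key_value].append(item)

    def query(self, prop: str, value, partition_key_value=None):
        """Return (matching items, number of physical partitions visited)."""
        if partition_key_value is not None:
            # Targeted query: go straight to the one logical partition.
            physical = self.partitions[self._physical_index(partition_key_value)]
            candidates = list(physical.get(partition_key_value, []))
            visited = 1
        else:
            # No partition key supplied: every physical partition must be scanned.
            candidates = [i for p in self.partitions for items in p.values() for i in items]
            visited = len(self.partitions)
        return [i for i in candidates if i.get(prop) == value], visited


container = ToyContainer(partition_key="offerId", physical_partitions=4)
for n in range(20):
    container.upsert({"custId": f"cust{n}", "offerId": f"offer{n % 5}"})

by_key = container.query("offerId", "offer3", partition_key_value="offer3")
fan_out = container.query("custId", "cust7")
print(len(by_key[0]), "matches,", by_key[1], "partition visited")
print(len(fan_out[0]), "match,", fan_out[1], "partitions visited")
```

Both queries return correct results; the difference is only in how many partitions have to be consulted, which is where the extra cost of a cross-partition query comes from.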

In general, if the data size is small, you don't even need to worry much about the partition key: you can use the ID as the partition key, and fetching data with or without the partition key won't noticeably affect performance. If the data size is huge, you need to choose a property as the partition key according to the principles below:

[Image: principles for choosing a partition key]
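The image with the principles did not survive here, so the following is only a sketch under an assumption: that, as commonly recommended, a good partition key has many distinct values and spreads items evenly. The helper name, the sample data, and the flattened item shape are hypothetical. It compares candidate properties by counting distinct values and the share of items that fall into the largest logical partition.

```python
from collections import Counter


def partition_key_report(items, candidate: str) -> dict:
    """Summarise how evenly a candidate property would spread the items."""
    counts = Counter(item[candidate] for item in items)
    return {
        "distinct_values": len(counts),
        # Fraction of all items that would land in the biggest logical partition.
        "largest_share": max(counts.values()) / len(items),
    }


# Made-up sample: custId is unique per item, offerId is heavily skewed.
items = [
    {
        "id": str(n),
        "custId": f"cust{n}",
        "offerId": "offer1" if n % 10 < 8 else f"offer{n % 10}",
    }
    for n in range(1000)
]

for candidate in ("custId", "offerId"):
    print(candidate, partition_key_report(items, candidate))
```

In this sample `offerId` would put 80% of the items into a single logical partition, while `custId` spreads them out; running the same kind of check on real data may help when deciding between candidates.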

Tiny Wang