10

I'm migrating a very simple MongoDB database (a couple hundred entries) to Azure Cosmos DB. My app is based on Node.js, so I'm using Mongoose as a mapper. Before, it was really simple: define a schema, query the collection, done.

Now, when setting up a collection in Cosmos DB, I was asked about a partition key and a shard key. The first I could ignore, but the second was required. After quickly reading up on the topic and understanding that it is a kind of partitioning (which, again, I neither need nor want), I just chose _id as the shard key.

Of course something does not work.

Find queries work just fine, but updating or inserting records fails with the error below:

MongoError: query in command must target a single shard key

Cosmos DB (with the Mongo API) was advertised to me as a drop-in replacement, which clearly is not the case, because I never needed to worry about such things in MongoDB, especially for such a small-scale app/DB.

So, can I disable sharding somehow? Alternatively, how can I define a shard key and not worry about it at all going forward?

Cheers

Stennie
baouss

3 Answers

1

You could create a Cosmos DB collection with a fixed maximum storage of 10 GB. In that case the collection does not have to be sharded, because the storage is not scalable, and you will not receive these errors from Cosmos DB. However, due to the minimum throughput of 400 RU/s, you might have slightly higher costs.

Roman Gherta
  • For this situation, the minimum is 100 RU per collection and 400 RU for the database. – Adrian Feb 06 '20 at 16:43
  • Last time I checked, yes. For example: a database has a minimum of 400 RUs, and every collection starting with the 5th adds 100 RUs. So for a database with 10 collections you will pay for a minimum throughput of 1000 RUs. – Roman Gherta Mar 03 '20 at 17:53
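
The arithmetic in the comment above can be sketched as a small function. This is only an illustration: the figures (400 RU database minimum, 100 RU for each collection from the 5th onward) are taken from the comment and may not match current Cosmos DB pricing, and the function name is hypothetical.

```javascript
// Assumption: 400 RU database minimum covers the first 4 collections,
// and each collection from the 5th onward adds 100 RU (per the comment above).
function minDatabaseThroughput(collectionCount) {
  const baseRu = 400;
  const includedCollections = 4;
  const ruPerExtraCollection = 100;
  const extraCollections = Math.max(0, collectionCount - includedCollections);
  return baseRu + extraCollections * ruPerExtraCollection;
}

console.log(minDatabaseThroughput(10)); // 1000, the example from the comment
```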
0

1. Can I disable sharding somehow?

Based on the statements in the official MongoDB documentation, this cannot be done.

MongoDB provides no method to deactivate sharding for a collection after calling shardCollection. Additionally, after shardCollection, you cannot change shard keys or modify the value of any field used in your shard key index.

So, you can't deactivate or disable the shard key.

2. Alternatively, how can I define a shard key and not worry about it at all going forward?

According to this link, you can set the shardKey option on your schema when you run insert/update operations against your collection.

new Schema({ .. }, { shardKey: { tag: 1, name: 1 }})

Please note that Mongoose does not send the shardcollection command for you. You must configure your shards yourself.

BTW, setting _id as the shard key might not be an appropriate decision. You can find some advice about choosing a shard key here. If you want to change the shard key or just remove it, please refer to this case: How to change the shard key

Jay Gong
  • Sorry for getting back to this so late. I read further that the shard key must be present on all insert and update operations. This clearly makes _id a less than ideal candidate, since in my case it is automatically created on insert. – baouss Mar 07 '19 at 07:28
  • @baouss Yes, I agree with you. So you have no choice but to keep using `_id` as the shard key? – Jay Gong Mar 07 '19 at 07:33
  • @baouss Thanks, baouss. – Jay Gong Mar 07 '19 at 09:03
  • So now I chose a different field as the shard key, not _id. When making a PUT request on the complete resource (small, 5 fields), including everything and changing a non-shard, non-"_id" field, it still gives me the error message that the operation failed: "query in command must target a single shard key". I am using Mongoose's findByIdAndUpdate method. – baouss Mar 08 '19 at 22:58
  • Trying to account for the shard key via shardKey in the Mongoose schema as suggested above does not work: shardKey accepts only a boolean: "Type '{ tag: number; name: number; }' is not assignable to type 'boolean'" – baouss Jun 05 '19 at 06:44
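
One possible reason findByIdAndUpdate still fails is that it filters on _id only, so the query does not contain the shard key. A minimal sketch of a workaround, assuming the shard key field is called `category` (a hypothetical name) and that `Model` is an existing Mongoose model: build a filter that carries both _id and the shard key, then pass it to findOneAndUpdate instead.

```javascript
// Hypothetical helper: builds an update filter that contains both the
// document id and the shard key, so the query can target a single shard.
function buildShardedFilter(id, shardKeyField, shardKeyValue) {
  if (shardKeyValue === undefined || shardKeyValue === null) {
    throw new Error(`Missing value for shard key field "${shardKeyField}"`);
  }
  return { _id: id, [shardKeyField]: shardKeyValue };
}

// 'category' and the values below are made up for illustration.
const filter = buildShardedFilter('5c80f1e2', 'category', 'books');
console.log(filter);

// Possible use with Mongoose (Model is assumed to exist elsewhere):
// await Model.findOneAndUpdate(filter, { $set: { title: 'New title' } }, { new: true });
```

Whether this resolves the error depends on the collection's actual shard key, so treat it as a starting point rather than a confirmed fix.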
-2

It is required that each collection has a partition key and there is no way you can disable this requirement. Inserting a document into your collection without a field that targets the partition key results in an error. As stated in the documentation:

The Partition Key is used to automatically partition data among multiple servers for scalability. Choose a JSON property name that has a wide range of values and is likely to have evenly distributed access patterns.
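
To catch the insert error described above before the request reaches Cosmos DB, a document can be checked for the partition key field first. This is only a sketch: the helper names and the `tenantId` field are hypothetical, and the check treats only a missing (undefined) value as an error.

```javascript
// Reads a possibly nested property such as "address.city" from a document.
function readPath(doc, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Hypothetical guard: throws if the document lacks the partition key field.
function assertHasPartitionKey(doc, partitionKeyPath) {
  if (readPath(doc, partitionKeyPath) === undefined) {
    throw new Error(`Document is missing partition key "${partitionKeyPath}"`);
  }
  return doc;
}

// 'tenantId' is an assumed partition key name, used only for illustration.
const doc = assertHasPartitionKey({ tenantId: 'acme', name: 'First entry' }, 'tenantId');
console.log(doc.tenantId); // acme
```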

Quantum