2

I would like to assign value to "_id" field of newly inserted document. The ID field must start from 0 and cannot skip a number. The insertion can only happen if there isn't a document containing duplicated content.

Currently this is what I have.

db.system.js.save({
 _id: "nextId", 
 value: function (x) {
    return db.counters.findAndModify({
      query:{_id:x}, 
      update:{$inc:{value:1}}, 
      new:true
    }).value;
 }
})
db.loadServerScripts()  // make the nextId available directly or via $where clause
db.counters.insert({_id: "_id", value: 0})
db.test.update(
    {"value": "abc"},
    {$setOnInsert: {"_id": nextId("_id"), "value": "abc"}},
    {upsert: true}
)

Everything works well if there are no document containing duplicated "value" field. When there are duplicate "value" field, nextId("_id") javascript function is still called even though it is inside $setOnInsert tag. Therefore, the observed incorrect behavior here would be some "_id" numbers will be consumed/wasted but cannot be assigned to valid new documents. The correct/intended behavior behavior would be the case where nextId() is called only when a new document is actually inserted.

Question: Is there a workaround to achieve the correct/indended behavior in MongoDB?

MongoDB version: v3.4.10 (local installation) and v3.4.15 (mlab MongoDB hosting)


Note: I have given up on this issue and decided to relax my requirement of "cannot skip a number" requirement. The modified requirement is to use as few Ids as possible globally while ensuring that the used Ids has as low values as possible. After this relax requirement change, I have devised a better and scale-able solution.

1) for each client, reserve a modest amount of ids. The result would be:

db.counters.findAndModify({
    query: {_id:"_id"},
    update: {$inc: {"_id_beg": 10, "_id_end": 10}},
    new: true
}) 

2) for each client's insertion, use the following

db.test.findAndModify({
  query: {"value": <input value>},
  update: {$setOnInsert: {"_id": <use lowest reserved id>, "value": <input value>"}
  upsert: true,
  new: true
})

3) for each client's insertion response, the returned value from findAndModify contains the nextId. If nextId is the same as the lowest reserved id, it means that it is used. If it's different, it means another client inserted the same value, we can use the reserved id for later insertion

There are some case specific optimizations such as unique index on _id, unused values from crashed clients, etc, to allow crash recovery

jackluo923
  • 105
  • 1
  • 7
  • This is very strange, according to MongoDB documentation on $setOnInsert it should not happen (https://docs.mongodb.com/manual/reference/operator/update/setOnInsert/). Question: when the query matches an existing document and it is updated (wasting a `nextId` number), is the `_id` updated with the new value? – cornacchia Jul 05 '18 at 08:19
  • Other question: which version of MongoDB are you using? I can't reproduce your code in my shell (I get a `nextId is not defined` error). According to documentation js functions should be available only in `mapReduce` or `$where` context https://docs.mongodb.com/manual/tutorial/store-javascript-function-on-server/) – cornacchia Jul 05 '18 at 08:23
  • @cornacchia $setOnInsert does indeed work, but not work as expected. The javascript is executed, but the returned value from javascript is not used. Therefore, when the query matches an existing document, the existing document is mot updated and the _id field stays the same. Because the javascript is still executed dispite inside $setOnInsert, newly generated _id is wasted. – jackluo923 Jul 05 '18 at 14:23
  • @cornacchia as a response to your 2nd question, I believe the script is not loaded. You can fix the "not defined" problem by executing db.loadServerScripts(). I have updated the questions with this clarification. https://docs.mongodb.com/v3.2/reference/method/db.loadServerScripts/ – jackluo923 Jul 05 '18 at 14:26
  • Thank you for the clarification, I could reproduce the issue now. It is indeed very strange behavior, I can't find anything about it in the documentation. – cornacchia Jul 05 '18 at 14:50

1 Answers1

0

Looking around I found this: https://stackoverflow.com/a/43695770/10026557

I tested it, looks like it could work:

db.test.update(
  {"value": "bca"},
  {$setOnInsert: {"_id": db.test.find().count() + 1, "value": "bca"}},
  {upsert: true}
)

Additional bonus is not having to rely on another collection and/or on javascript logic.

cornacchia
  • 473
  • 2
  • 9
  • For this solution, I believe there's a chance of duplicated ids if there are multiple documents inserted at a time by different clients due to "Write (_id value) after Read (db.test.find().count() + 1)" issue. – jackluo923 Jul 05 '18 at 15:38
  • I have already given up on this and have since slightly relaxed the requirements. I devised a "free of javascript", higher performance, scale-able, bulkop compatible solution free of concurrency issues (I hope). I have updated the revised solution in the original post. Thanks for the help. – jackluo923 Jul 05 '18 at 16:02