
I am using an upsert query to insert/update a document in a collection, however the query does not always upsert reliably. It sometimes inserts another document when it should have updated an existing one. You can see the query below (this is the updated query, after Alex pointed out the issue):

db.test.update(
      { departmentID: "1",
        storeID: "1",
        customerID: "1"},
      {  
         $set: { 
           name: "Andy",
           rating: 1,
           score: 1
         },
         $setOnInsert: { 
             departmentID: "1",
             storeID: "1",
             customerID: "1"
         }
      },
      { upsert: true }
   );

This query mostly works, but sometimes, when the above query is run twice in quick succession (a gap of 0.004 seconds), it ends up inserting 2 documents with departmentID "1", storeID "1" and customerID "1", whereas it should have inserted only one document on the first call and updated it on the second.
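To confirm the duplicates, a grouping aggregation along these lines can be run (a sketch, assuming the keys are stored as the string values shown above):

db.test.aggregate([
   // group on the three key fields and count how many documents share them
   { $group: {
       _id: { departmentID: "$departmentID", storeID: "$storeID", customerID: "$customerID" },
       count: { $sum: 1 },
       ids: { $push: "$_id" }
   } },
   // keep only key combinations that occur more than once, i.e. duplicates
   { $match: { count: { $gt: 1 } } }
]);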

  • Having a unique index on departmentID, storeID and customerID could be used, but I am not sure of 2 things: 1) Whether it will degrade the performance of the upsert queries. 2) Whether it will work or fail under the same conditions – Shrey Gupta Apr 17 '19 at 09:51
  • And which document do you think should have been updated, and what was the document inserted? Point being, if you want to make a claim then it needs to be reproducible. We'd like you to show this reproducible case, since that is not how "upserts" work for the rest of us here. – Neil Lunn Apr 17 '19 at 09:55
  • @NeilLunn the documents that are created are: {departmentID: "1", storeID: "1", customerID: "1", name: "Andy", rating: 1, score: 1} and {departmentID: "1", storeID: "1", customerID: "1", name: "Andy", rating: 1, score: 1}. So the first document should have been updated, instead of a new document being inserted. – Shrey Gupta Apr 17 '19 at 10:18
  • Show a reproducible case, please. I'm asking you to prove that what you claim happens can actually happen. – Neil Lunn Apr 17 '19 at 10:21
  • @NeilLunn how do you recommend I do that? – Shrey Gupta Apr 17 '19 at 10:29
  • Possible dupe of https://stackoverflow.com/questions/37295648/mongoose-duplicate-key-error-with-upsert – JohnnyHK Apr 17 '19 at 20:05
  • @JohnnyHK ***An upsert that results in a document insert is not a fully atomic operation*** – your comment on that question. – Ashh Apr 18 '19 at 05:43
  • @NeilLunn ***Operations are still atomic, and even more so in modern releases*** – which one is correct? – Ashh Apr 18 '19 at 05:43

1 Answer


https://docs.mongodb.com/manual/reference/method/db.collection.update/#use-unique-indexes is quite clear about it:

To avoid inserting the same document more than once, only use upsert: true if the query field is uniquely indexed.
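In this case that means a compound unique index on the three filter fields; a minimal sketch (the field order shown is just one reasonable choice):

db.test.createIndex(
   // compound unique index on the upsert filter fields, so two concurrent
   // upserts cannot both insert the same key combination
   { departmentID: 1, storeID: 1, customerID: 1 },
   { unique: true }
);

With this in place, a racing upsert fails with a duplicate key error instead of silently inserting a second document.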

Update

There are a few other problems with the query; thanks to Neil Lunn for pointing them out. The replace-document syntax used in the original query results in the following document being inserted:

{
    "_id" : ObjectId(....),
    "name" : "Andy",
    "rating" : 1.0,
    "score" : 1.0
}

This document doesn't have any departmentID, storeID, or customerID, so it won't match the filter on subsequent calls of the update query, i.e. running the same query 5 times on an empty collection will result in 5 documents being inserted.
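The original statement is no longer shown in the question, but a replace-style update that produces the document above would presumably have looked roughly like this (a reconstruction, not the OP's exact code):

db.test.update(
   { departmentID: "1", storeID: "1", customerID: "1" },
   // replacement document: the filter fields are not carried over on insert
   { name: "Andy", rating: 1, score: 1 },
   { upsert: true }
);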

The correct query should use the individual-fields update syntax with a combination of $set and $setOnInsert:

db.test.update(
   { departmentID: "1",
     storeID: "1",
     customerID: "1"},
   {  
      $set: { 
        name: "Andy",
        rating: 1,
        score: 1
      },
      $setOnInsert: { 
          departmentID: "1",
          storeID: "1",
          customerID: "1"
      }
   },
   { upsert: true }
);
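
One more point, since it comes up in the comments: even with the unique index, two upserts racing on the same key can still collide, but the loser then fails with an E11000 duplicate key error rather than creating a duplicate, and the usual remedy is to retry the statement. A rough sketch of that retry pattern, assuming the legacy mongo shell and the unique index above (upsertCustomer is a hypothetical helper, not part of the question):

// retry the upsert once if it loses the race and hits the unique index
// (E11000 duplicate key error); the retry then matches the document
// inserted by the other writer and takes the update path instead
function upsertCustomer(key, fields) {
   var res = db.test.update(key, { $set: fields }, { upsert: true });
   if (res.hasWriteError() && res.getWriteError().code === 11000) {
      res = db.test.update(key, { $set: fields }, { upsert: true });
   }
   return res;
}

upsertCustomer(
   { departmentID: "1", storeID: "1", customerID: "1" },
   { name: "Andy", rating: 1, score: 1 }
);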
Alex Blex
  • I read the question as claiming that "upserts" create duplicate entries, and not about "unique indexes". I would suggest that the issue is in fact that the "duplicate" documents are not in fact **identical** on the predicate in the statement. Therefore the same predicate on a unique index would not change a thing. – Neil Lunn Apr 17 '19 at 09:57
  • @NeilLunn if "the above query is run in quick succession" I guess the documents are identical, no? – Alex Blex Apr 17 '19 at 10:03
  • Um, no. Operations are still atomic, and even more so in modern releases. So it's not really possible for there to be "no found document" on two operations. An index would force an error, but it's really up to the OP to prove otherwise. Odds are that `"1"` sometimes gets sent as `1`, more than some random race condition being presented. – Neil Lunn Apr 17 '19 at 10:06
  • @AlexBlex will adding a unique index degrade the performance of the MongoDB upsert? – Shrey Gupta Apr 17 '19 at 10:22
  • @NeilLunn I am not sure I understand the second part of your argument. – Shrey Gupta Apr 17 '19 at 10:22
  • Lol, @NeilLunn you are right, the query is just wrong. I have updated the answer. The warning still exists in the official documentation. – Alex Blex Apr 17 '19 at 10:34
  • @ShreyGupta it will improve it. Without the index, the upsert does a COLLSCAN to find the document to update. The index allows an IXSCAN, which is way faster. – Alex Blex Apr 17 '19 at 10:42
  • @AlexBlex I am using the query that you just pointed out; I did not write out the exact query in the question. An index on the same fields exists, but it is not a unique index. Wouldn't upserts with unique indexes cause some sort of locking to prevent duplication? – Shrey Gupta Apr 17 '19 at 11:35
  • @ShreyGupta no, but you really should re-think how you ask questions. We don't have access to your code. Answers are based only on information provided in the question. If you lie in your question you get false answers. – Alex Blex Apr 17 '19 at 11:42