
I retrieve a list of dicts, which I try to insert into the database with self.db.bugzilla.insert_many(docs). However, the documents appear to be duplicated with each execution.

I haven't created indexes, as was suggested in other solutions to prevent duplicates. None of the solutions from "mongodb: insert if not exists", "Insert_many dataframe into MongoDB and skipping duplicates issues" and "How to insert_many with txMongo avoiding duplicated keys?" seem to fit my case.

I read the documentation too, but I still can't figure it out. I have already tried insert_one and update_many. Can someone help?

An example of the data looks like this:

[
  {
    "assigned_to": "test@email.com",
    "creation_time": "20211208T14:48:10",
    "id": 1193534,
    "assigned_to_detail": {
      "email": "test@email.com",
      "id": 52497,
      "real_name": "john John",
      "name": "test@email.com"
    },
    "resolution": "FIXED",
    "status": "RESOLVED"
  },
  // {...},
  // ...
]

Then, once I run the script a few times, I can see the duplicated entries using db.bugzilla.find({assigned_to: 'test@email.com'}). The only difference between them is the added _id.

rickhg12hs
b10n1k
  • You should clarify the desired behavior: what do you want to do with duplicated records? – dododo Jun 05 '23 at 13:15
  • Perhaps if you show (portions of) example `dict`'s (documents), and how you identify duplicate documents, someone may be able to show a working MongoDB operation to update your collection. – rickhg12hs Jun 05 '23 at 16:45
  • Description updated. – b10n1k Jun 05 '23 at 21:39
  • @dododo When the dictionary is exactly the same, I don't want it to be inserted. Just insert the ones whose id is not in the db. Of course, I will have to update an existing one when one of its fields changes later on. – b10n1k Jun 05 '23 at 21:43
  • _"when the dictionary is exactly the same"_ Duplicate records depend on every field, not just on, say, `"id"`? – rickhg12hs Jun 06 '23 at 03:46
  • For example, {'k1': 'v1', 'k2': 'v2'} should not be added again. But when it is added, Mongo adds an _id. I guess that's why it creates a new document, right? I can't see how I can prevent that. – b10n1k Jun 06 '23 at 16:49
  • Depending on exactly what is considered a duplicate document, you could perform an update with an `upsert` option. – rickhg12hs Jun 06 '23 at 20:30
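
The upsert suggestion in the last comment could look roughly like the sketch below. It assumes that the Bugzilla "id" field is what identifies a duplicate, which the question implies but does not state outright. The helper names upsert_specs and upsert_many are hypothetical; UpdateOne and bulk_write are the PyMongo calls used. Each document becomes an update keyed on "id" with upsert=True, so a bug that is not yet stored gets inserted, while one that already exists has its fields overwritten instead of being stored a second time.

```python
from typing import Any, Dict, Iterable, List, Tuple

Spec = Tuple[Dict[str, Any], Dict[str, Any]]


def upsert_specs(docs: Iterable[Dict[str, Any]], key: str = "id") -> List[Spec]:
    """Build (filter, update) pairs, one per document (hypothetical helper).

    Assumption: `key` uniquely identifies a bug, so two dicts with the
    same `key` value are treated as the same document.
    """
    specs = []
    for doc in docs:
        # insert_many() adds an "_id" to the dicts it is given; drop it so
        # "$set" never tries to change the immutable "_id" of a stored doc.
        fields = {k: v for k, v in doc.items() if k != "_id"}
        specs.append(({key: doc[key]}, {"$set": fields}))
    return specs


def upsert_many(collection, docs, key: str = "id"):
    """Insert new documents and update existing ones in a single round trip."""
    from pymongo import UpdateOne  # imported here so upsert_specs needs no driver

    ops = [UpdateOne(flt, upd, upsert=True) for flt, upd in upsert_specs(docs, key)]
    if not ops:
        return None  # bulk_write() raises on an empty list of operations
    return collection.bulk_write(ops, ordered=False)
```

In the script from the question, this would replace the insert_many call with something like upsert_many(self.db.bugzilla, docs). The returned result exposes upserted_count and modified_count, which show how many documents were new and how many changed. A unique index on "id" (for example via create_index("id", unique=True)) would probably still be worth adding as a safety net, though any duplicates already in the collection would have to be removed before it can be built.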

0 Answers