I have a DB named StarsGallery that stores objects with multiple array fields. Each field is an array of custom image objects, and each image has a hash value. Before inserting an image, we first generate its hash value and then check that this hash value doesn't already exist (which happens when someone tries to insert the same picture twice). This query iterates over all the hash values in all the arrays of each star object, which takes a lot of time. The time is the problem I'm trying to solve.

To make things clearer, this is what an object looks like:

starObject = {
  id: ObjectId(...),
  comedyImages: [{id:..., hashValue:...,}, {id:..., hashValue:...,}],
  dramaImages: [{id:..., hashValue:...,}, {id:..., hashValue:...,}],
  animationImages: [{id:..., hashValue:...,}, {id:..., hashValue:...,}]
}

Is there a good way to make sure I don't insert the same hash value twice? I thought about making the hash value unique or putting an index on it (but the index would need to be on a specific field of the items in a specific array, which makes it less helpful), but I'm not sure what the best solution is.

Thanks in advance! :)

Puneet Singh
Yoni Elisha ッ
  • If the size of the hash value is not a concern, you can use SHA-256 as the hash algorithm; the chance of two different inputs producing the same hash is negligible. See this thread for more details: https://stackoverflow.com/questions/4014090/is-it-safe-to-ignore-the-possibility-of-sha-collisions-in-practice – lucid May 13 '20 at 08:30
  • Actually, I'm using it, but I want to prevent the case where someone enters the same picture twice (so it gets the same hash). – Yoni Elisha ッ May 13 '20 at 08:43
  • So, do you want the same picture not to appear in any category (comedy, drama, etc.), or not to appear in the same category in any other object? – lucid May 13 '20 at 08:47
  • I think it will only be possible to check that a picture isn't already in the same category... – Yoni Elisha ッ May 13 '20 at 08:53
  • You can check that the `hashValue` is not present in any of the three arrays before inserting a new _image object_ into one of them. But to make sure the image's `hashValue` is unique across documents, you have to create a unique index on the array field (indexes on array fields are called [Multikey Indexes](https://docs.mongodb.com/manual/core/index-multikey/index.html)). – prasad_ May 13 '20 at 09:14
  • See this post's answer: [How to set unique constraint for field in document nested in array?](https://stackoverflow.com/questions/61655391/how-to-set-unique-constraint-for-field-in-document-nested-in-array) – prasad_ May 13 '20 at 09:21
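
The index idea from the comments can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive setup: `hashIndexSpecs` and the index names are hypothetical, and only the field names come from the question. It builds one non-unique multikey index specification per image array, which should let a lookup on `hashValue` use an index instead of scanning every array. Note that a unique multikey index only enforces uniqueness between different documents; it does not stop the same value from appearing twice inside one document's array, so the in-document check still has to be part of the update itself.

```javascript
// Hypothetical helper: builds one index specification per image array.
// Each spec indexes the hashValue field of the array's elements (a multikey index).
function hashIndexSpecs(arrayFields) {
  return arrayFields.map((field) => ({
    key: { [`${field}.hashValue`]: 1 },   // ascending index on <array>.hashValue
    name: `${field}_hashValue_idx`,       // hypothetical index name
  }));
}

// Field names taken from the question's starObject.
const specs = hashIndexSpecs(['comedyImages', 'dramaImages', 'animationImages']);
console.log(JSON.stringify(specs, null, 2));
```

With the MongoDB Node.js driver, an array of this shape could be passed to `collection.createIndexes(specs)`; whether to add `unique: true` depends on whether the same image may legitimately belong to more than one star.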

1 Answer

Let's assume you are adding a new image to comedyImages; then you can use the query below:

Model.update({
  _id: ObjectId(...),
  // Match the document only if no existing comedy image has this hash
  'comedyImages.hashValue': {
    $ne: 'HashValueOfNewImage'
  }
}, {
  $push: {
    comedyImages: {
      hashValue: 'HashValueOfNewImage'
    }
  }
}).then((raw) => {
  // Check raw.nModified to see whether the document was modified.
  // If it was not modified, either HashValueOfNewImage was already in comedyImages
  // or no document matched the given _id.
}).catch(next);

If you want to check for HashValueOfNewImage in all the arrays at once, extend the query condition to cover every array:

Model.update({
  _id: ObjectId(...),
  // Match the document only if none of the three arrays contains this hash
  'comedyImages.hashValue': {
    $ne: 'HashValueOfNewImage'
  },
  'dramaImages.hashValue': {
    $ne: 'HashValueOfNewImage'
  },
  'animationImages.hashValue': {
    $ne: 'HashValueOfNewImage'
  }
}, {
  $push: {
    comedyImages: {
      hashValue: 'HashValueOfNewImage'
    }
  }
}).then((raw) => {
  // Check raw.nModified to see whether the document was modified.
  // If it was not modified, either HashValueOfNewImage was already in
  // comedyImages, dramaImages or animationImages, or no document matched the given _id.
}).catch(next);
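
Since the two queries differ only in which arrays they check, the filter and update documents can be generated instead of written by hand. The following is a minimal sketch, assuming the field names from the question; `buildPushIfAbsent` and `IMAGE_ARRAYS` are hypothetical names introduced here for illustration. It only builds plain objects, so it runs without a database connection.

```javascript
// Array fields from the question's starObject.
const IMAGE_ARRAYS = ['comedyImages', 'dramaImages', 'animationImages'];

// Hypothetical helper: builds the filter and update for a
// "push the image only if its hash is absent" operation.
function buildPushIfAbsent(starId, targetArray, image, arraysToCheck = IMAGE_ARRAYS) {
  if (!arraysToCheck.includes(targetArray)) {
    throw new Error(`Unknown image array: ${targetArray}`);
  }
  // The document matches only if none of the checked arrays holds the hash.
  const filter = { _id: starId };
  for (const field of arraysToCheck) {
    filter[`${field}.hashValue`] = { $ne: image.hashValue };
  }
  // Append the new image to the chosen array.
  const update = { $push: { [targetArray]: image } };
  return { filter, update };
}

// Example with placeholder values.
const example = buildPushIfAbsent('someStarId', 'dramaImages', { id: 'img1', hashValue: 'abc123' });
console.log(JSON.stringify(example, null, 2));
```

The resulting `filter` and `update` can then be passed to the update call shown above. Because the check and the push happen in a single update operation on one document, two concurrent requests cannot both insert the same hash into that document.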
Puneet Singh