2

Use Case

I've got a collection band_profiles and I've got a collection band_profiles_history. The history collection is supposed to store a band_profile snapshot every 24 hour and therefore I am using MongoDB's recommended format for historical tracking: Each month+year is it's own document and in an object array I will store the bandProfile snapshot along with the current day of the month.

My models:

A document in band_profiles_history looks like this:

{
    "_id" : ObjectId("599e3bc406955db4cbffe0a8"),
    "month" : 7,
    "tag_lowercased" : "9yq88gg",
    "year" : 2017,
    "values" : [
        {
            "_id" : ObjectId("599e3bc41c073a7418fead91"),
            "profile" : {
                "_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
                "tag" : "9YQ88GG",
                "name_normalized" : "example name1",
            },
            "day" : 1
        },
        {
            "_id" : ObjectId("599e3bc41c073a7418fead91"),
            "profile" : {
                "_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
                "tag" : "9YQ88GG",
                "name_normalized" : "new name",
            },
            "day" : 2
        }
    ]
}

And a document in band_profiles:

{
    "_id" : ObjectId("5989a6190f39d9fd70cddeb1"),
    "tag" : "9V9LRGU",
    "name_normalized" : "example name",
    "tag_lowercased" : "9v9lrgu",
}

This is how I upsert my documents into band_profiles_history at the moment:

BandProfileHistory.update(
  { tag_lowercased: tag, year, month},
  { $push: {
        values: { day, profile }
    }
  }, 
  { upsert: true }
)

My problem:

I only want to insert ONE snapshot for every day. Right now it would always push a new object into the object array values no matter if I already have an object for that day or not. How can I achieve that it would only push that object if there is no object for the current day yet?

pedromss
  • 2,443
  • 18
  • 24
kentor
  • 16,553
  • 20
  • 86
  • 144

3 Answers3

1

Putting mongoose aside for a moment:

There is an operation addToSet that will add an element to an array if it doesn't already exists.

Caveat:

If the value is a document, MongoDB determines that the document is a duplicate if an existing document in the array matches the to-be-added document exactly; i.e. the existing document has the exact same fields and values and the fields are in the same order. As such, field order matters and you cannot specify that MongoDB compare only a subset of the fields in the document to determine whether the document is a duplicate of an existing array element.

Since you are trying to add an entire document you are subjected to this restriction.

So I see the following solutions for you:

Solution 1:

Read in the array, see if it contains the element you want and if not push it to the values array with push.

This has the disadvantage of NOT being an atomic operation meaning that you could end up would duplicates anyways. This could be acceptable if you ran a periodical clean up job to remove duplicates from this field on each document.

It's up to you to decide if this is acceptable.

Solution 2:

Assuming you are putting the field _id in the subdocuments of your values field, stop doing it. Assuming mongoose is doing this for you (because it does, from what I understand) stop it from doing it like it says here: Stop mongoose from creating _id for subdocument in arrays.

Next you need to ensure that the fields in the document always have the same order, because order matters when comparing documents in the addToSet operation as stated in the citation above.

Solution 3

Change the schema of your band_profiles_history to something like:

{
    "_id" : ObjectId("599e3bc406955db4cbffe0a8"),
    "month" : 7,
    "tag_lowercased" : "9yq88gg",
    "year" : 2017,
    "values" : {
       "1": { "_id" : ObjectId("599e3bc41c073a7418fead91"),
            "profile" : {
                "_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
                "tag" : "9YQ88GG",
                "name_normalized" : "example name1"
            }
        },
        "2": {
            "_id" : ObjectId("599e3bc41c073a7418fead91"),
            "profile" : {
                "_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
                "tag" : "9YQ88GG",
                "name_normalized" : "new name"
            }
        }

}

Notice that the day field became the key for the subdocuments on the values. Notice also that values is now an Object instead of an Array.

No you can run an update query that would update values.<day> only if values.<day> didn't exist.

Personally I don't like this as it is using the fact that JSON doesn't allow duplicate keys to support the schema.

pedromss
  • 2,443
  • 18
  • 24
  • After reading I'd favour solution 3 but I have no idea how a mongoose schema would look like to realize associative object arrays? I can't rely on addToSet as profiles will always changes (there are object properties which change very frequently on the same day). – kentor Sep 01 '17 at 16:57
  • What about a mix of solution 1 but with the `push` operator? Would that be acceptable for your requirements? – pedromss Sep 01 '17 at 17:06
  • A mix of what solutions do you mean? I meant I can't rely on the addToSet because the profile I am inserting will be different. So that wouldn't prevent me from inserting multiple profiles for the same day unfortunately. Hence solution 3 is still my favourite unless I didn't understand you correctly – kentor Sep 01 '17 at 17:19
  • Forget the "mix" I misspoke. Solution one is what I meant. I dislike schemas like solution 3. – pedromss Sep 01 '17 at 17:20
1

First of all, sadly mongodb does not support uniqueness of a field in an array of a collection. You can see there is major bug opened for 7 years and not closed yet(that is a shame in my opinion).

What you can do from here is limited and all is on application level. I had same problem and solve it in application level. Do something like this:

  1. First read your document with document _id and values.day.
  2. If your reading in step 1 returns null, that means there is no record on values array for given day, so you can push the new value(I assume band_profile_history has record with _id value).
  3. If your reading in step 1 returns a document, that means values array has a record for given day. In that case you can use setoperation with $operator.

Like others said, they will be not atomic but while you are dealing with your problem in application level, you can make whole bunch of code synchronized. There will be 2 queries to run on mongodb among of 3 queries. Like below:

db.getCollection('band_profiles_history').find({"_id": "1", "values.day": 3})

if returns null:

db.getCollection('band_profiles_history').update({"_id": "1"}, {$push: {"values": {<your new band profile history for given day>}}})

if returns not null:

db.getCollection('band_profiles_history').update({"_id": "1", "values.day": 3}, {$set: {"values.$": {<your new band profile history for given day>}}})

barbakini
  • 3,024
  • 2
  • 19
  • 25
0

To check if object is empty

{ field: {$exists: false} }

or if it is an array

 { field: {$eq: []} }    

Mongoose also supports field: {type: Date} so you can use it instead counting a days, and do updates only for current date.

ASem
  • 142
  • 3
  • 16
  • I haven't even tried it but I wonder why I would be able to define a condition like $exists inside of the values object? – kentor Aug 31 '17 at 15:14
  • Maybe because you want to $push based on some conditon? You dont have to try it, you know, originlly if ask question you not trying the answer – ASem Aug 31 '17 at 17:00
  • Because it doesn't make sense in your answer I am not even passing the profileData I want to push into the values object anymore – kentor Aug 31 '17 at 17:06
  • Because I cant see the schema and I cant see how exactly `band_profiles` beieng inserted and how are you setting your fields I have updated my answer. – ASem Aug 31 '17 at 17:18