I have a MongoDB schema design that embeds the same subdocument (with the same ObjectId) in different documents of the same collection:

A document looks like:

{"inbox":
    {   "_id": ...,
        "conversations": [
            {"_id": ...,
             "messages": [{"_id": ...,
                           "body": ...}]
            }
        ]
    }
}

In the inbox collection, whenever two people have a conversation, I push a copy of the conversation to both inboxes.

My plan was to keep reads of the inbox simple and use a multi-document update to write a message to every conversation with the same conversation id. It seems to work as expected, but am I missing a downside to allowing duplicate subdocument ObjectIds across different documents within a collection?
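
For concreteness, here is a minimal sketch of the multi-document update described above, written as the plain filter and update dicts a driver such as pymongo would accept. The helper name `build_push_message` and the sample ids are hypothetical, and the field paths assume `inbox` is a top-level field shaped like the document shown earlier.

```python
# Hypothetical helper: builds the filter and update documents for appending
# one message to every embedded copy of a conversation.
# Assumption: "inbox" is a top-level field, as in the document shown above.
def build_push_message(conversation_id, message):
    # Match every inbox document that embeds a conversation with this _id.
    query = {"inbox.conversations._id": conversation_id}
    # "$" refers to the first conversations element matched by the filter,
    # so the message is pushed onto that conversation's messages array.
    update = {"$push": {"inbox.conversations.$.messages": message}}
    return query, update


query, update = build_push_message("conv-1", {"_id": "msg-1", "body": "hello"})

# With pymongo and a running server, this pair would be passed as:
#   db.inbox.update_many(query, update)
```

Note that a multi-document update like this is atomic per document, not across all matched documents, so the two copies of a conversation can briefly disagree if the operation is interrupted partway.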

Christian Davis
  • 2
    Take a look at this: http://stackoverflow.com/questions/5373198/mongodb-relationships-embed-or-reference – Abdullah Rasheed Oct 30 '15 at 18:34
  • @inspired Thanks, that kind of reinforces why I went with this approach, but I'm mostly concerned with the fact that these ObjectIds are identical. I know that ObjectIds are supposed to be unique, so I'm not sure what the implications are there. – Christian Davis Oct 30 '15 at 18:39
  • 1
    It will sometimes be necessary to have duplicate ObjectIds in your documents if multiple references exist. I can't think of a reason why subdocuments with the same ObjectId would be harmful in this case. Keep in mind the potential growth when embedding in this style. There is info on sharding here, and it discusses the model used for messaging: http://blog.mongodb.org/post/65612078649/schema-design-for-social-inboxes-in-mongodb – Abdullah Rasheed Oct 30 '15 at 19:22
  • @ChristianDavis "Thanks, that kind of reinforces..." If I were you, I'd be coming to the exact opposite conclusion. Your schema is very complicated. It will be INCREDIBLY difficult to keep consistent and the queries will be difficult. As the answer in that link states, "it's more an art than a science". It sure is, but the only "scientific" principle I live by in mongo is never, ever, ever put an array inside an array. It's the grim reaper of schema designs. – chrisbajorin Oct 30 '15 at 20:37
  • @cdbajorin I'm not sure what's complicated about a 2D array, or which queries would be difficult. I have inbox.find(inbox.id), and inbox.update(conversation.id, push(msg)). Am I missing something? – Christian Davis Oct 30 '15 at 21:36
  • 1
    @ChristianDavis It's a 3-part reason. First, the [positional operator](http://bit.ly/1NH5MyL) is limited to a depth of 1, so any updates to specific items in the deeper array are nearly impossible. Second, having the same data in multiple places leads to inconsistency. It's just bound to happen. Having checks and cron-like jobs to manage eventual consistency then becomes more difficult because of the first point. Lastly, documents have a size limit (16 MB). With two arrays of potentially unbounded size ([inbox?](http://i.imgur.com/9fRREgJ.png)), you'll hit it at some point. It doesn't scale. – chrisbajorin Oct 31 '15 at 02:04
  • 1
    @ChristianDavis To finish my overly long reply, your document could easily be split into 3 collections, with conversations having a pointer to an inbox, and messages having a pointer to a conversation (and to an inbox if needed). Properly indexed, your queries would likely stay the same, and you remove all the scaling anxiety people like me feel when we see an array :) The things I'll put in an array are generally enumerable with a clearly defined limit, e.g. siblings, last 5 passwords, languages spoken – chrisbajorin Oct 31 '15 at 02:17
  • @cdbajorin Thanks for the feedback. This isn't an actual inbox, so I wasn't worried about scaling. At most 31 conversations are open at a time, at which point they're moved to an "archived" collection, and only a small number of messages are exchanged. I was considering the message array to be immutable, but I didn't realize I literally would not be able to access the messages with the positional operator. I'll probably separate them into at least two collections, maybe 3. Thanks! – Christian Davis Oct 31 '15 at 17:32
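
A rough sketch of the split into separate collections suggested in the comments, again as plain dicts rather than a definitive schema. The function names and the fields `inbox_ids` and `conversation_id` are assumptions for illustration; a conversation here points to both participants' inboxes through a small, bounded array.

```python
# Hypothetical document builders for separate "conversations" and "messages"
# collections. Field names are illustrative, not taken from the original post.
def new_conversation(conversation_id, inbox_ids):
    # One conversation document, referenced by both inboxes (bounded array).
    return {"_id": conversation_id, "inbox_ids": list(inbox_ids)}


def new_message(message_id, conversation_id, body):
    # Each message is its own document pointing back to its conversation.
    return {"_id": message_id, "conversation_id": conversation_id, "body": body}


def conversations_for_inbox(inbox_id):
    # Filter: an equality match on an array field matches documents whose
    # array contains the value.
    return {"inbox_ids": inbox_id}


def messages_for_conversation(conversation_id):
    # Filter for all messages of one conversation.
    return {"conversation_id": conversation_id}


conversation = new_conversation("conv-1", ["inbox-a", "inbox-b"])
message = new_message("msg-1", "conv-1", "hello")
```

With this shape a new message is a single insert into one collection, so there are no duplicate copies to keep in sync, and indexes on `inbox_ids` and `conversation_id` would keep the two lookups cheap.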

0 Answers