
I've come across three different ways of "joining" collections:

  1. Manually keep a "foreign-key-esque" reference to the collection you wish to join with your target collection
  2. Use DBRefs
  3. Write a series of Map/Reduce functions to maintain the relationship

Can someone explain the benefits of each one and when I should use it?

My first impression is that Map/Reduce is for large, frequently used sets and the other two are mainly meant for small/fast queries.
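
To make the first two options concrete, here is a minimal sketch of what the stored documents might look like. The collection name `users`, the field names `authorId` and `author`, and the helper functions are hypothetical, and a `Map` stands in for a real collection so the snippet runs without a database. It is only meant to show that both options store an id and both need a second lookup from the client.

```javascript
// Hypothetical documents; collection and field names are illustrative only.
const author = { _id: 'u1', name: 'Adam' };

// Option 1: manual reference. The post stores only the author's _id,
// and the application knows which collection to look in.
const postManual = { _id: 'p1', title: 'Joins', authorId: author._id };

// Option 2: DBRef. Same idea, but the reference also names the
// collection ($ref) alongside the id ($id).
const postDbRef = {
  _id: 'p2',
  title: 'Joins',
  author: { $ref: 'users', $id: author._id },
};

// A Map stands in for the users collection (assumption: no real database here).
const users = new Map([[author._id, author]]);

// Either way, the client performs a second lookup to resolve the reference.
function resolveManual(post) {
  return users.get(post.authorId);
}

function resolveDbRef(post, collections) {
  return collections[post.author.$ref].get(post.author.$id);
}

console.log(resolveManual(postManual).name);
console.log(resolveDbRef(postDbRef, { users: users }).name);
```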

Jonathan Hall
Adam
  • I would say you should never maintain a join through Map/Reduce, and DBRef is just another version of the first option; the only difference is that a DBRef also holds a collection name. – Sammaye Jun 04 '13 at 19:47
  • Interesting. Do you know of a way to optimize the references so you wouldn't need to do an additional query from the client to fetch the referenced record? Is there a way to do this on the server before the data set is returned? – Adam Jun 04 '13 at 20:08
  • No, MongoDB has no server-side resolution of references. – Sammaye Jun 04 '13 at 20:12
  • Welcome to the challenges of a document oriented DB system like MongoDB. :) MongoDB isn't a good fit for some types of systems if there are a large number of document requests that are necessary to build a complete "view" of data. If a lot of the data is dynamic, then caching on the middle-tier may not be a good fit like it is in some systems (depends on your scenarios). – WiredPrairie Jun 04 '13 at 20:13
  • Might look at: http://stackoverflow.com/questions/9412341/mongodb-is-dbref-neccessary/9412613#9412613 and http://stackoverflow.com/questions/6847371/finding-documents-by-array-of-dbrefs/6847532#6847532 for some further comments. – WiredPrairie Jun 04 '13 at 20:16
  • @WiredPrairie - Good call, thanks for those links. Looks like manual references are the way to go if I stick with Document-Oriented DBs. Stinks that if I have a result set of n elements, I have to perform n+1 queries just to do a join. Would you recommend any alternative NoSQL databases that would avoid this problem? – Adam Jun 04 '13 at 20:44
  • If your referenced docs aren't large, you might try the second link's technique (using `$in`). But, if the referenced docs are spread about in multiple collections, it won't help much. – WiredPrairie Jun 04 '13 at 20:48
  • there are no joins in MongoDB. So 1 is really the only option. – Asya Kamsky Jun 04 '13 at 23:38
  • Another way is to use embedded documents. I know that sometimes we are afraid to explode our objects in the DB, but my approach is to try it and fix it when it becomes necessary (kind of a lean experiment). The benefit of embedded documents is that you get them whenever you query your collections, so there is no need for joins. – Kfir Erez Jun 05 '13 at 18:25
  • @KfirErez - By embedded document, do you mean de-normalizing the data and repeating the needed data within the queried document? If so, do you know of any common way to achieve "eventually normal" data. Something like a routine that would update the original foreign document if need be? – Adam Jun 05 '13 at 19:07
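
The `$in` technique mentioned in the comments above can be sketched as follows: instead of one lookup per post (n+1 queries), collect the referenced ids and fetch them in a single second query. The data, the field name `authorId`, and the `find` helper are hypothetical; `find` is an in-memory stand-in that understands only a `{ _id: { $in: [...] } }` filter, not a real driver call.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const usersCollection = [
  { _id: 'u1', name: 'Adam' },
  { _id: 'u2', name: 'Kfir' },
];
const postsResult = [
  { _id: 'p1', authorId: 'u1' },
  { _id: 'p2', authorId: 'u2' },
  { _id: 'p3', authorId: 'u1' },
];

// Minimal stand-in for a find() call; it supports only { _id: { $in: [...] } }.
function find(collection, filter) {
  const wanted = new Set(filter._id.$in);
  return collection.filter((doc) => wanted.has(doc._id));
}

// Query 1 already produced postsResult.
// Query 2 fetches every referenced author at once, with duplicates removed.
const authorIds = [...new Set(postsResult.map((p) => p.authorId))];
const authors = find(usersCollection, { _id: { $in: authorIds } });

// Stitch the two result sets together on the client.
const authorsById = new Map(authors.map((u) => [u._id, u]));
const joined = postsResult.map((p) => ({ ...p, author: authorsById.get(p.authorId) }));

console.log(joined.map((p) => p._id + ' by ' + p.author.name));
```

As noted in the comments, this helps less when the referenced documents are spread across several collections, since each collection then needs its own `$in` query.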

1 Answer

Sorry for the late response - here is a simple example of an embedded document written with Mongoose:

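var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var postSchema = new Schema({
  author : {type : String},
  title : {type : String, required : true},
  content : {type : String, required : true},
  // Embedded document: stored inside each post, so it is returned with the post.
  comment : {
    owner : {type : String},
    subject : {type : String, required : true},
    content : {type : String, required : true}
  }
});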

The document here is defined by postSchema (strictly speaking it is the schema, but I guess you know what I mean).
The comment is the embedded document: as you can see, it is an object defined inside the post.
The benefit is that you get the comment each time you fetch the post, without an additional query. However, if you have many comments, the post document becomes very large!
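
To illustrate that trade-off, here is a rough sketch in plain JavaScript (no database involved). It assumes a variation of the schema above in which the post holds an array of comments rather than a single one; the names `embeddedPost`, `addComment`, and `approxSize` are hypothetical, and the JSON length is only a proxy for the stored size.

```javascript
// Hypothetical post with its comments embedded as an array (illustrative data only).
const embeddedPost = {
  title: 'Joining collections',
  content: 'Post body',
  comments: [],
};

// Appending a comment changes the post itself; there is no separate comments collection.
function addComment(post, owner, content) {
  post.comments.push({ owner: owner, content: content });
  return post;
}

// Rough size proxy: length of the JSON serialization (not the actual BSON size).
function approxSize(doc) {
  return JSON.stringify(doc).length;
}

const sizeBefore = approxSize(embeddedPost);
for (let i = 0; i < 1000; i += 1) {
  addComment(embeddedPost, 'user' + i, 'A short comment');
}
const sizeAfter = approxSize(embeddedPost);

// One read returns the post and every comment, but the document keeps growing.
console.log(embeddedPost.comments.length, sizeBefore, sizeAfter);
```

Keep in mind that MongoDB caps a single document at 16 MB, so unbounded embedded arrays eventually need to move to their own collection.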

Kfir Erez
  • Thanks. I ended up having a second schema for the Comments so that I could reference them either by owner or by Post (to borrow the naming structure in your example). It seemed to be the best fit for my particular use cases, although I can see that with MongoDB and document-based databases you have to orient your thinking a little differently. – Adam Jun 24 '13 at 21:30
  • I agree, it takes some time to make the switch from RDBMS to document-based DBs. – Kfir Erez Jul 18 '13 at 10:41