12

I have an MSSQL database which I am considering porting to CouchDB or MongoDB. I have a many-to-many relationship within the SQL db which has hundreds of thousands rows in the xref table, corresponding to tens of thousands of rows in the tables on each side of the relationship. Will CouchDB and/or MongoDB be able to handle this data, and what would be the best way of formatting the relevant documents for performant querying? Thanks very much.

Journeyman
  • 10,011
  • 16
  • 81
  • 129
  • See also answers to [this question](http://stackoverflow.com/questions/2336700/mongodb-many-to-many-association) and [this question](http://stackoverflow.com/questions/4839881/how-to-organise-a-many-to-many-relationship-in-mongodb) – Matthew Murdoch Jul 27 '12 at 08:23

2 Answers2

8

For CouchDB, I would highly recommend reading this article about Entity Relationships.

One thing I would note in CouchDB is to be careful of attempting to "normalize" a non-relational data model. The document-based storage offers you a great deal of flexibility, and it's seldom the best idea to abstract everything into as many "document types" as you can think of. Many times, it's best to leave much of your data within the same document unless you have clear cases where separate entities exist.

One common use-case of many-to-many relationships is implementing tagging. There are articles about different methods you can use to accomplish this in CouchDB. It may apply to your requirements, it may not, but it's probably worth a read.

Dominic Barnes
  • 28,083
  • 8
  • 65
  • 90
  • very interesting articles. My key document types will be Person and Company, with a Many-to-many between these two types. If I have tens or hundreds of thousands of documents for each type, with a separate document for each relationship, do you think the performance will be ok? Thanks very much. – Journeyman Mar 31 '11 at 16:35
  • I would **not** create separate documents to define the relationships. I would either have the Company document contain a list of Persons, or have the Person document contain a list of Companies. I think the first option makes more sense in most cases, but you may decide differently. – Dominic Barnes Mar 31 '11 at 17:44
  • Do you think the performance will be okay with such a solution ? I'm not familiar with CouchDB's performances but, I mean, hundreds of thousands of companies containing hundred of thousands of persons ? (It seems huge to me :) – Arnaud Mar 31 '11 at 22:04
  • You'd have to benchmark and see for yourself, unfortunately. There is always the alternative of putting a list of companies on each person, which may work better with large collections like this. – Dominic Barnes Apr 01 '11 at 13:58
3

Since the 'collection' model of MongoDB is similar to tables you can of course maintain the m:n relationship inside a dedicated mapping collection (using the _id of the related documents of the referenced documents from other collections).

If you can: consider redesign your application using embedded documents.

http://www.mongodb.org/display/DOCS/Schema+Design

In general: try to turn off your memories to a RDBMS when working with MongoDB. Blindly copying the database design from RDBMS to MongoDB is neither helpful nor adviceable nor will it work in general.

  • thanks. My two main tables are separate 'silos', and are joined via the xref table, and I am totally constrained in terms of these two core collections. The reason I want to use Mongo is so the main silos can be flexible in their structure. – Journeyman Mar 31 '11 at 11:56