
I've been reading a bit about document design for MongoDB, specifically relating to referencing, embedding and linking.

I'm still not sure what the best layout would be for the kind of documents I wish to store.

Imagine, if you will, a dynamic form-building application. Each form has a set of static attributes (name, description, etc.), a set of dynamically created fields (name = text, age = number, etc.), and a set of submitted results.

Something like this:

$obj = array(
    'name' => 'Test',
    'description' => 'Hello',
    'live' => false,
    'meta' => array (
        'name' => 'text',
        'age' => 'number',
        'sex' => 'boolean',
    ),
    'records' => array(
        array('name' => 'Test', 'age' => 20, 'sex' => true),
        // ... further submitted records ...
    )
);

My confusion is mainly about whether I should embed the records within each form document, or create an individual collection for each form. My intuition is to do the latter, while the MongoDB books I have been reading state that embedding data is fine.

Can someone please advise, given that I expect forms to have large quantities of submitted data?

Thanks.

Edit: I also found this question to be helpful.

Leigh
  • This question belongs on programmers.stackexchange.com – Ben Lee Feb 20 '12 at 15:17
  • That said, the answer is "it depends". If you only need to access/query the records in the context of particular forms, embed them. If you need to access/query them independently, you may want to create a separate collection. – Ben Lee Feb 20 '12 at 15:19
  • @BenLee: If I require geospatial indexing for certain field types, is the separate collection approach more favourable? I can't think of any cases where form submissions would be required outside of the context of the form. In terms of performance, though, would I be correct in thinking that mixing records for different forms in a single collection will create non-contiguous data on disk and reduce query performance as more and more records are submitted? – Leigh Feb 20 '12 at 15:30

2 Answers


I would store all the form definitions in one collection, and the form submissions in another collection. Keep in mind that a single document (including its embedded documents) is currently limited to 16 MB.
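
To get a feel for how quickly embedded records could approach that 16 MB ceiling, here is a rough back-of-the-envelope sketch in Python. It assumes the compact JSON length is an acceptable stand-in for the real BSON size (it is only an approximation), and the helper names `approx_size` and `approx_max_embedded_records` are made up for illustration.

```python
import json

# The per-document limit mentioned above.
MAX_DOC_BYTES = 16 * 1024 * 1024


def approx_size(doc):
    """Rough size proxy: byte length of the compact JSON encoding.

    Assumption: this is NOT the true BSON size, only a ballpark figure.
    """
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def approx_max_embedded_records(form, sample_record):
    """Estimate how many records shaped like sample_record fit in one document."""
    base = approx_size(form)
    per_record = approx_size(sample_record) + 1  # +1 for the separating comma
    return (MAX_DOC_BYTES - base) // per_record


# Same shape as the form in the question, with no records yet.
form = {
    "name": "Test",
    "description": "Hello",
    "live": False,
    "meta": {"name": "text", "age": "number", "sex": "boolean"},
    "records": [],
}
sample = {"name": "Test", "age": 20, "sex": True}

estimate = approx_max_embedded_records(form, sample)
print("roughly", estimate, "tiny records before one document hits the limit")
```

Real submissions with more or larger fields would lower that number considerably, and a popular form would hit the ceiling sooner or later either way.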

It makes a lot more sense to have the submissions in a separate collection. Whether you use a different collection for each of your form types is up to you; it depends on how alike the forms are and how many different types you expect.

You don't need to worry about MongoDB creating non-contiguous data on disk: it stores each document separately and doesn't care which fields are part of the document.
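
As a minimal sketch of that split, using plain Python dicts rather than a live database: the function `split_form`, the `form_id` reference field and the `values` wrapper are all hypothetical names chosen for illustration, not anything MongoDB prescribes. Wrapping the submitted data in `values` is one way to keep customer-defined field names from clashing with `form_id`.

```python
def split_form(embedded, form_id):
    """Split one embedded form document into a definition plus submissions.

    Returns (definition, submissions): the definition keeps everything
    except 'records', and each record becomes its own document that
    points back at the form via the hypothetical 'form_id' field.
    """
    definition = {k: v for k, v in embedded.items() if k != "records"}
    definition["_id"] = form_id
    submissions = [
        {"form_id": form_id, "values": dict(record)}
        for record in embedded.get("records", [])
    ]
    return definition, submissions


# The embedded layout from the question, with two example records.
embedded_form = {
    "name": "Test",
    "description": "Hello",
    "live": False,
    "meta": {"name": "text", "age": "number", "sex": "boolean"},
    "records": [
        {"name": "Test", "age": 20, "sex": True},
        {"name": "Other", "age": 31, "sex": False},
    ],
}

definition, submissions = split_form(embedded_form, form_id="form-1")
print(definition)
for doc in submissions:
    print(doc)
```

With a driver, the definition would go into one collection and the submissions into another (or into one collection per form, if you prefer), and an index on `form_id` would keep lookups of a single form's submissions cheap.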

Derick
  • I probably misremembered something about the pre-allocation, and thought it was at collection level rather than db level. I can't find where I read it now either. Thanks for confirming my intuition was leading me down the right path with regard to breaking out the records into separate collections. The 16MB limit **will** be a problem if I embed. There is no telling how similar or different the forms will be; they're dynamic and down to the customer to design, so I think a separate collection for each will be the way to go. Cheers. – Leigh Feb 20 '12 at 16:01

Also, storing records along with the form definition would create unnecessary levels of hierarchy, which makes it painful to read data at just the level you need.

Ravi Khakhkhar