
I have a problem modeling data structure in MongoDB. Here are my considerations:

  1. Let's say I have a parent object and a child object. A parent can contain many child objects. I could have a practically unlimited number of parent objects (let's say hundreds of thousands) and probably at most one thousand child objects in one parent. The problem is that one of the parent's properties is calculated based on the list of children. So, if I store a reference to the parent in each child, I will have to update the parent when adding a new child, which MongoDB doesn't support as an atomic operation (two different documents). If I embed the list of children in the parent, then I probably wouldn't have to store this property in the parent, because it could be calculated on the fly. However, if I retrieve a list of parents, all children will be downloaded too, which isn't ideal for a page that lists only parents. Also, on the parent details page I would like to paginate the child objects, but they will always all be downloaded, right? Which approach is better here?
  2. Dictionaries in MongoDB. Let's say I have a property whose value comes from a dictionary, and a document which holds this dictionary. It could look like [ObjectId1: 'A', ObjectId2: 'B', ObjectId3: 'C']. Now, if I define a property in another document that references this dictionary, how can I join the two? When I retrieve objects with such a property, how can I get the dictionary values in the property instead of the object ids? If I inserted the values directly, I would have no control over which value is inserted and whether it is valid.
pshemu
  • As stated, I think these questions are too broad and have too many possible answers. Can you ask the questions in the context of a specific application or use case, where there's a chance we can determine one approach that is superior to others? – wdberkeley Mar 19 '15 at 14:53

1 Answer

  1. "Additionally, if I retrieve list of parents, all children will be downloaded, which isn't entirely good for a page with parent only list."

You can solve that problem very easily by specifying which fields to include or exclude. That works if the children are held under a single key, say children. In the mongo shell:

db.parent.find( {}, { children: 0 } )

If the children do not fall under a single key, then you must know all of their keys in advance, which is impractical.

db.parent.find( {}, { child1: 0, child2: 0, ... child1000: 0 } )

If atomicity is important, then embedding solves that issue (and perhaps only that one).

Paginating embedded child objects is awkward. You would have to slice the array yourself (MongoDB's $slice projection can do this, but the whole parent document is still read), whereas a separate collection for children permits very simple pagination using skip and limit.

db.children.find({}).skip(5).limit(10)
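
If the children stay embedded, one page of the array can still be requested through the $slice projection operator. The sketch below only builds the projection document from a page number. The field name children, the helper name childPageProjection, and the variable parentId are assumptions made for illustration, not something from the question.

```javascript
// Hypothetical helper: builds a projection that returns one page of an
// embedded `children` array using MongoDB's $slice projection operator.
function childPageProjection(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  // $slice: [skip, limit] skips `skip` elements, then returns `limit` elements.
  const skip = (page - 1) * pageSize;
  return { children: { $slice: [skip, pageSize] } };
}

// Possible use in the mongo shell (parentId is assumed to exist):
//   db.parent.find({ _id: parentId }, childPageProjection(2, 10))
```

This keeps the single-document atomicity of embedding, at the cost of the server still reading the whole parent document for every page.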

A single collection is recommended for simplicity and atomicity, but given 1000 child objects (which do demand pagination, as you rightly suggest), a second collection is beneficial.

  2. What are you talking about, Willis? Your example [ObjectId1: 'A', ObjectId2: 'B', ObjectId3: 'C'] is written as an array of dictionaries. It could be a regular "object": {ObjectId1: 'A', ObjectId2: 'B', ObjectId3: 'C'}, and then you just do a simple find using dot notation, object.ObjectId1. Or I must be missing something.
Gabe Rainbow
  • 2. Let me explain this in relational database terms, which I am more familiar with. We have a dictionary of activities: 1 Swimming, 2 Running, 3 Riding a bike. We have a person table, which contains an activity column holding the values 1, 2, 3. But to the user I would like to display the activity name, so normally I would join that with the activity table. How can I achieve that in Mongo? – pshemu Mar 20 '15 at 08:54
  • 1. So I can have the list of children nested, which removes the atomicity problem and the download problem, but I will not have proper pagination? Isn't it possible to provide the indexes of the children that I would like to retrieve? – pshemu Mar 20 '15 at 08:58
  • 1. You can index children on a subdocument. Here is an SO response: http://stackoverflow.com/questions/6104860/indexing-on-a-field-which-is-in-array-of-subdocuments. That doesn't solve paging through the 1000 children. And IMHO you want paging, just like Google only shows you 10 results per page (versus returning 200,000 possible results). Otherwise your I/O would be far too slow for usability. Paging is critical. – Gabe Rainbow Mar 20 '15 at 22:35
  • 2. OK. You can just include {"activities": "Running"}, or an array if it is one-to-many, {"activities": ["Running", "Riding", "Swimming"]}. That is acceptable in the fast, cheap NoSQL world that is hampered by having no joins. Plus you can run simple finds using db.people.find({"activities": "Running"}), which will give you all people who are running, for example. Or, more efficiently, with paging: db.people.find({"activities": "Running"}).skip(0).limit(10) – Gabe Rainbow Mar 20 '15 at 22:40
  • 2. You will find that in the non-relational world you will often duplicate data. It makes for fast request/response times but is very frustrating for lookup tables. For example, if you change your mind and "Running" is now called "Jogging", you will be forced to update all people with running as their activity. So ensure those terms are final. And of course you can store 1, 2, 3 and simply map them to string constants in your UI. Up to you. Both are good. – Gabe Rainbow Mar 21 '15 at 02:07
  • Yeah, I see now that I should probably store {activity: "Running"} in every record, instead of {activity: 1} and then looking up what 1 means. But what if in the future I would like to change it to something else, like "Running alone", or translate it into different languages? And because this information comes from the UI, I am worried that somebody could input "Abcd" as an activity, a value which I should control. – pshemu Mar 21 '15 at 08:21
  • db.person.update( {"activity": "Running"}, { "$set": {"activity": "Jogging"} }, { "upsert": false, "multi": true } ) Your point about subverting the input via the UI is true, but then it must also hold for 4, 5, 6, 7, .... – Gabe Rainbow Mar 23 '15 at 01:04
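
The lookup-table discussion in the comments above can be sketched as an application-side "join": load the activity dictionary once, then map stored ids to display names and reject unknown ids before saving. This is only one possible approach. The in-memory arrays stand in for documents that would come from hypothetical activities and person collections, and all function names are made up for illustration.

```javascript
// Hypothetical lookup data; in MongoDB this could be the contents of an
// `activities` collection fetched once and cached by the application.
const activities = [
  { _id: 1, name: "Swimming" },
  { _id: 2, name: "Running" },
  { _id: 3, name: "Riding a bike" },
];

// Build an id -> name map once so each lookup is a cheap Map access.
function buildLookup(docs) {
  return new Map(docs.map((d) => [d._id, d.name]));
}

// Return a copy of the person with the stored id replaced by its display
// name; unknown ids become null instead of leaking the raw id to the UI.
function withActivityName(person, lookup) {
  return { ...person, activity: lookup.get(person.activity) ?? null };
}

// Check a value coming from the UI against the dictionary before saving.
function isValidActivity(id, lookup) {
  return lookup.has(id);
}
```

Because only the id is stored in each person document, renaming "Running" to "Jogging" or adding translations would touch the dictionary alone, and a value such as "Abcd" fails the isValidActivity check.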