Storing data of reverse foreign keys

Question

I would like to store some info of a model in a model that the first model references.

Consider i have these classes (ids omitted)

class Student:
    currentClassroom = ForeignKey(Classroom)
    age = Integer

class Classroom:
    totalAgeOfStudents = Integer

So, if I add a student to a classroom, it's classroom should also be updated. In more complex cases, consider we also have Campus, City, and Province. So, when a student is added/updated, it's new and old classroom, classroom's campus, campus' city and city's province should also be updated. These models are fetched so often that calculating per-request is out of question here.

Before I migrated to node.js, i used to do something like this in django (with MySQL): Every model has a 'calculate' method. whenever a student is updated, it's old and new classrooms are flagged as dirty in this calculate method. After that, models flagged as dirty are processed, their calculate method is called, they set their fields accordingly (for example, they calculate total age of students by querying the student table), then if some other model should also change, they flag it dirty too. And so on.

I don't think this method is nice at all. I tried save/delete/update hooks in django (mongoose also have save/delete hooks), but the situation doesn't change: whenever I receive an event of a student, I first update the fields of his/her classroom, then trigger classroom's corresponding event and so on.

Is there a better way for this in mongoose that is more effective and nicer?

Bounty Edit: I know that in my example, I can flag classroom as dirty and then use aggregation methods of MongoDB to calculate related fields. I'm asking for a nicer approach.

If someone is going to downvote, he/she should at least explain his/her reasoning. — holgac, Jun 05 '14 at 09:16
What is "nice" and "not nice" and "effective" and "not effective" (and "more effective")? You seem to be describing a general & unavoidable approach: "recalculate dependant info". (If you've left anything out, it's "preferably incrementally".) How are we to describe something "different" and "better"? If you were patenting your old method from this desription... asking someone to implement it... do you think what it is and what its benefits and failings are are made clear here? (Rhetorical question. Answer: no.) What is this "this" that is not nice? And what is this "nice" that it is not? — philipxy, Jun 20 '14 at 08:13
nice -> not involving low level, db triggers, ie. doing the job in the server code itself. effective -> fast, which means, as you said, preferably incrementally. I think explanation of my old django implementation is not incomprehensible, although too general. And to clarify, I'm asking if there is a faster/nicer implementation possible. For example, using a redis server for sum calculation. The redis server would store summed fields and when a change occurs, the corresponding redis entry is changed (or deleted for recalculation). — holgac, Jun 20 '14 at 11:51

score 0 · Accepted Answer · answered Jun 20 '14 at 09:59

The tl;dr: Update your summaries (or whatever) by determining a dag on their combined expressions then evaluating in dag order from (parts of) base tables up through shared subexpressions to the various output tables (or parts of base tables). Incrementally where worth it.

That title... "Storing data of reverse foreign keys"

A foreign key has a certain sense of direction. But it describes a constraint. Constraints are truths about the database. The proposition corresponding to a foreign key constraint is an implication (->) in terms of table meanings and an inclusion (<=) in terms of table values. That's a kind of logical/arithmetic directionality. But it is irrelevant to querying. A FK declaration/existence might suggest that two tables will tend to be joined (and will help optimization if they are) but the declared FKs have nothing to do with the order in which any particular query's results or subexpressions depend on each other or base tables. But those are the relevant dependencies for the aggregations etc that you are updating.

Foreign keys are irrelevant to the meaning of queries in terms of the meanings of constituent tables. A properly normalized database does not have a direction. A query expression, ignoring arbitrary ordering of commutative operators and acknowledging equivalent rearrangements, can be considered (partial-)ordered inside out, ie leaves up. A collection of queries, because they can share subexpressions, define a directed acylic graph, with outermost query operators at the top and base table values at the bottom. It is this graph that you should be evaluating up through. This should be incremental where possible and worth the effort. Eg you don't need to recalculate the sum of debt over all students if you know the changes in debt for the few who changed.

So if your method--which is entirely unclear except in the most general terms and as hinted by your otherwise non-sequitur title--involves following foreign keys, don't. Follow the data. From the bases to the query results. FKs are only relevant to this in that they might affect what the best dag to evaluate is to optimize expression sharing. (Would that the DBMS would optimize for simultaneous queries.)

When you clarify your question, explain how your title and FKs in particular have anything to do with your question. Although perhaps your thinking them relevant has something to do with your dissatisfaction. In other words, the recalculation order you seek should be per queries not FKs.

Well actually, you're right, it's about optimizing the calculation for queries, not models themselves. But every time I fetch Classrooms matching some criteria, I also want to have their calculated fields (total age etc.). Since I might have many classrooms, campuses, cities etc., calculating per-query is not a good idea in terms of speed. Since the idea of 'total age of students' of a classroom is somewhat connected to the classroom instances, I thought it might be better to store that data in model itself. — holgac, Jun 20 '14 at 11:33
By the way, I think the title explains the situation pretty well. If students have one to many relation to classrooms, we can say students have FK to classrooms. And classrooms have (virtual) reverse FK to students. Since I want to calculate some data in reverse FKs of a classroom, the title is kind of explanatory. — holgac, Jun 20 '14 at 11:42
The calculations are for the *conceptual* queries you want. The request whose result you are "fetching". You can calculate and/or cache intermediate results in whatever layers you choose. Although only using the database would be a reasonable first design. — philipxy, Jun 21 '14 at 07:17
You write strangely of FKs. A FK is not a field/column or pointer. It is a constraint that there is exactly one instance/row matching a given field/column in another class/table. When told this an ORM gives just one pointer *instead of a set containing one pointer*. The single pointer is not itself a FK. (And classroom does not have a FK to student. You mean a "student-occupied classroom" (association/relationship) entity, the association/relationship being "student entity occupies classroom entity".) See my answer,a FK has a *sense* of direction but *constraints and queries do not*. — philipxy, Jun 22 '14 at 00:27
I didn't imply that classrooms have a FK to students. It's the reverse, actually. And, since when defining FKs, you create a field for it (in this instance, Student.currentClassroom) that stores the id of the FK'ed object (Classroom.id). I don't think discussing the definitions of the terms are helping at all. Yeah you define a FK then ORM creates a pointer for you, or you can manually create a field that points to the referenced object (by storing one of it's unique fields) that does the exact same job except constraints. But this has absolutely nothing to do with my problem at all. — holgac, Jun 22 '14 at 10:06

Storing data of reverse foreign keys

1 Answers1