3

I have a Django application that must use a database provided by another source. I am not able to change the schema. Everything below was generated by inspectdb:

class Family(models.Model):
    persistanceid = models.BigIntegerField(primary_key=True)

class Spouseref(models.Model):
    persistanceid = models.BigIntegerField(primary_key=True)

class Family_spouseref(models.Model):
    family_persistanceid = models.ForeignKey(Family, db_column='family_persistanceid')
    wiferefs_persistanceid = models.ForeignKey(Spouseref, unique=True, null=True, db_column='wiferefs_persistanceid', blank=True, related_name='wiferefs')
    husbandrefs_persistanceid = models.ForeignKey(Spouseref, unique=True, null=True, db_column='husbandrefs_persistanceid', blank=True, related_name='husbandrefs')
    childrefs_persistanceid = models.ForeignKey(Spouseref, unique=True, null=True, db_column='childrefs_persistanceid', blank=True, related_name='childrefs')

A ManyToManyField won't work under Family, since the mapping table, Family_spouseref, has multiple foreign keys to Spouseref. Given this schema, what's the most efficient way to accomplish something like this:

fam = Family.objects.get(persistanceid=id)
husbands = fam.husbands.all()
wives = fam.wives.all()
children = fam.children.all()

Update:

As requested... the following functions are what I tried to add within the Family model above.

I tried raw SQL like so:

    def husbands(self):
        return Spouseref.objects.raw ('SELECT * FROM spouseref WHERE spouseref.persistanceid IN (SELECT husbandrefs_persistanceid FROM family_spouseref WHERE family_persistanceid=%s);', [self.persistanceid])

Which works, but is raw SQL really the best way? I lose out on a lot of the usual object methods this way, it seems like.

When I tried doing it through the ORM, it kept wanting to think that the Family_spouseref table had a column id, which it clearly does not.

Family_spouseref.objects.get(family_persistanceid=self.persistanceid)

Produced:

[DEBUG] django.db.backends: (0.001) SELECT "family_spouseref"."id", "family_spouseref"."family_persistanceid", "family_spouseref"."wiferefs_persistanceid", "family_spouseref"."husbandrefs_persistanceid", "family_spouseref"."childrefs_persistanceid" FROM "family_spouseref" WHERE "family_spouseref"."family_persistanceid" = 45036  LIMIT 21; args=(45036,)
DatabaseError: column family_spouseref.id does not exist
LINE 1: SELECT "family_spouseref"."id", "family_spouseref"."family_p...

I also tried:

Family_spouseref.objects.get(Q(family_persistanceid=self.persistanceid) | Q(husbandrefs_persistanceid__isnull=False))

Which produced the same ID error but did attempt to follow the FK relationship:

[DEBUG] django.db.backends: (0.000) SELECT "family_spouseref"."id", "family_spouseref"."family_persistanceid", "family_spouseref"."wiferefs_persistanceid", "family_spouseref"."husbandrefs_persistanceid", "family_spouseref"."childrefs_persistanceid" FROM "family_spouseref" LEFT OUTER JOIN "spouseref" ON ("family_spouseref"."husbandrefs_persistanceid" = "spouseref"."persistanceid") WHERE ("family_spouseref"."family_persistanceid" = 45036  OR "spouseref"."persistanceid" IS NOT NULL) LIMIT 21; args=(45036,)
DatabaseError: column family_spouseref.id does not exist
LINE 1: SELECT "family_spouseref"."id", "family_spouseref"."family_p...

I'm also horrified to see by the debug how often it's calling the same (bad) statement (8 times) vs. the raw SQL, which it calls twice. (Still, twice!)

I really would like to know the official Django way for traversing these tables rather than ignore the ORM entirely and do raw SQL. Thanks.

TAH
  • 1,658
  • 1
  • 19
  • 37
  • Notice my question was phrased, "most efficient way." Obviously in plain SQL the answer is trivial. I've tried a number of two-part or manual (i.e. direct SQL) strategies to hop from one table to the other -- clumsy, but workable -- but I left those out of my question b/c I'm fairly certain they're inefficient and not how Django is "meant" to handle data. I am hoping someone with Django expertise can help me better understand the "Django way." I don't see that a genuine request for insight into Django is in any way lazy on my part... – TAH Jun 09 '12 at 20:33
  • Perhaps we can point out how, and ways to improve them. – Ignacio Vazquez-Abrams Jun 09 '12 at 20:35
  • OK, updated. Have at it. Thanks. – TAH Jun 09 '12 at 21:51

1 Answers1

2

The error you are getting is because Django set by default an "id" field as primary key, if no field has "primary_key=True" declared as argument (you did that in Family model for instance). So when you try to get your model objects, you will always get an error like this:

DatabaseError: column family_spouseref.id does not exist

First of all, give that model a primary key. I bet it has one. If no, maybe you just want to read this. Now, about your question of efficiency, you can do this:

def husbands(self):
    husbandrefs_for_family = Family_spouseref.objects.filter(
        family_persistanceid=self.persistanceid,
    ).values_list('husbandrefs_persistanceid',flat=True)

    return Spouseref.objects.filter(
        persistanceid__in=husbandrefs_for_family,
    )

Although it seems that you are doing 2 queries, you will see that Django just make one (well, it's a query with a subquery like the one you did in SQL).

The values_list part of the subquery is used to bring just the columns of the table you want. I'ts the select husbandrefs_persistanceid part of the SLQ statement you want.

Hope it helps!

Community
  • 1
  • 1
marianobianchi
  • 8,238
  • 1
  • 20
  • 24
  • Thank you thank you so much! This is exactly what I was looking for! However... I can't test it with my code, because of the PK issue. I did not define this model, and I don't have a way of changing the schema (long story, but Django has to share its database with a separate Java application that owns the schema). It seems like until I can find a PK for that table (and it really does not have one -- well, I guess it could have a composite PK, but I know Django doesn't support that), I am totally out of luck? – TAH Jun 10 '12 at 02:46
  • No, you are not out of luck. Have a look at [this](http://stackoverflow.com/questions/605896/django-querying-read-only-view-with-no-primary-key). It's hard to believe that a SQL table doesn't have a pk, but it could be your case. If that's the case, how do you know if there are repeated objects? and how do you handle them? Maybe, an option is to set one field as pk and then add as a meta option that all fields are unique together. That way, django won't complain about a missing id field and i think you won't have any db problem... – marianobianchi Jun 10 '12 at 03:05
  • Try to test that code. Cause maybe (i haven't test it myself) using the "values_list" method avoids the pk issue and you are getting Spouserer objects, but that table has a pk so you won't get an error for that query. – marianobianchi Jun 10 '12 at 03:15
  • Sorry for the delay in getting back to you. Alas, same db error, "column family_spouseref.id does not exist." Once I addressed that, I could see that the SQL generated was not right: "SELECT "spouseref"."dtype", "spouseref"."persistanceid", "spouseref"."_pref", "spouseref"."ref", "spouseref"."_frel", "spouseref"."_mrel" FROM "spouseref" INNER JOIN "family_spouseref" ON ("spouseref"."persistanceid" = "family_spouseref"."husbandrefs_persistanceid") WHERE "family_spouseref".===>"id" IN (SELECT U0."husbandrefs_persistanceid" FROM "family_spouseref" U0 WHERE U0."family_persistanceid" = 45126 );" – TAH Jun 11 '12 at 22:10
  • Where it says ===>"id" above, the correct SQL should have husbandrefs_persistanceid. – TAH Jun 11 '12 at 22:11
  • OK, I resolved it. return Spouseref.objects.filter( persistanceid__in=husbandrefs_for_family, ) Thanks so much! I am in great shape now! – TAH Jun 11 '12 at 22:53
  • BTW, I now see that having made the fix in my previous comment, you are right -- I am now able to work around the lack of PK in that mapping table. Hurrah! I hear all of your concerns about not having one, but it comes from a vendor... ;-) – TAH Jun 11 '12 at 23:03
  • Oh, you are right... My bad! I was trying to use the "related_name" of "husbandrefs_persistanceid" but what you did is easier and error-free! I'm glad you resolved your problem! I'll edit my answer so the right query stays there in case this help somebody else... – marianobianchi Jun 12 '12 at 00:06