I recently started with Django and haven't stopped enjoying Python/Django yet, but I'm currently struggling with a logical problem.
Situation (simplified):
class A(models.Model):
    foo = models.CharField(max_length=255)

class B(models.Model):
    bar = models.CharField(max_length=255)
    foo = models.ForeignKey(A)

class C(models.Model):
    title = models.CharField(max_length=255)
    bar = models.ForeignKey(B)

class D(models.Model):
    name = models.CharField(max_length=255)
    title = models.ForeignKey(C)
    bar = models.ForeignKey(B)
(The real use case consists of hundreds of these classes. Yes, it's a mess and it clearly proves a bad database design, but I can't change anything about that.)
I've created dynamic ModelForms for every class. The general purpose is to read an Excel file and feed its rows into the right ModelForms, with field validation etc. Every Excel file has multiple sheets that map to the classes; the first row (header) names the model fields and all other rows hold the data.
The data comes completely unsorted. Normally, the insert order that doesn't break the foreign key sequence would be A => B => C => D, but the sheets could just as well arrive as D => B => C => A. The problem strikes when I validate the first sheet, D: it doesn't validate because the rows its foreign keys point to haven't been created yet.
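To make the sheet layout concrete, here is a minimal sketch. It assumes each sheet has already been read into a plain list of rows (however the Excel file is parsed); the helper name rows_to_records and the sample values are hypothetical, not part of my real code.

```python
def rows_to_records(rows):
    """Turn one sheet (header row + data rows) into a list of dicts."""
    if not rows:
        return []
    # The first row names the model fields.
    header = [str(cell).strip() for cell in rows[0]]
    records = []
    for row in rows[1:]:
        # Skip rows that are completely empty.
        if all(cell in (None, "") for cell in row):
            continue
        records.append(dict(zip(header, row)))
    return records


# Hypothetical sheet for model B: 'foo' holds the key of the related A row.
sheet_b = [
    ["bar", "foo"],
    ["first bar", 1],
    ["second bar", 2],
]
records = rows_to_records(sheet_b)
print(records[0])  # {'bar': 'first bar', 'foo': 1}
```

Each dict could then be passed as the data argument of the matching ModelForm, which is where the foreign key validation fails if the parent sheet hasn't been loaded yet.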
The question is, how can I add all data and verify the referential integrity afterwards?
Thanks in advance!
Thanks for your help!
Actually, all primary keys are derived from the root model, which holds the mapping table for all child tables. I didn't mention it in the first post as I wanted to keep the situation simple. Having said that, I can't change that (mess!), nor can I redesign the classes, as they map to an existing (messy!) database. And to make this mess complete, every field is set to NOT NULL.
My second idea was to fill a mapping table first (no real idea how to do that yet) and sort the incoming data by it. That sounds like monkey work, it's dirty and I don't like the idea myself; I had hoped there were smarter ways.
Do you have any hints on any mathematical solutions to this problem? It's like spanning a tree on arbitrary data.
UPDATE:
I made two functions to solve this, haven't tested the error handling yet.
validate_tables: Looks up all tables related to the given app and saves their foreign key parents in a dict (self.found_fields) of the form child: [parent, parent, (...)].
gen_sequence: Fills a list (self.sequence) with the object_names in the right insert order.
Improvements welcome!
This is my current solution (snippet to get the idea)
from django.db import models
from django.db.models.loading import get_app, get_models  # legacy app-loading API of older Django versions

def validate_tables(self):
    # Map every model of the app to the models its foreign keys point to.
    app = get_app("testdata")
    self.sequence = []
    self.found_fields = {}
    for model in get_models(app):
        hits = []
        for local_field in model._meta.local_fields:
            if isinstance(local_field, models.ForeignKey):
                hits.append(local_field.related.parent_model._meta.object_name)
        self.found_fields.update({model._meta.object_name: hits})
    if self.gen_sequence():
        return True
    else:
        # A dict cannot be raised, so wrap the leftovers in an exception.
        raise ValueError("unresolvable foreign keys: %r" % self.sequence_errors)

def gen_sequence(self, unresolved=None):
    if unresolved:
        self.found_fields = unresolved
    unresolved = {}
    for model in self.found_fields:
        parents = self.found_fields[model]
        # all() is True for an empty list, so models without parents pass too.
        if all(parent in self.sequence for parent in parents):
            self.sequence.append(model)
        else:
            unresolved[model] = parents
    if unresolved == self.found_fields:
        # No model could be placed in this pass: circular or missing parents.
        self.sequence_errors = unresolved
        return False
    elif not unresolved:
        # Everything is placed (previously this returned the bound method).
        return True
    else:
        return self.gen_sequence(unresolved)
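
As far as I can tell, this ordering problem is a topological sort of the foreign key graph. For comparison, here is a standalone sketch using Kahn's algorithm instead of the repeated passes above. It needs no Django and takes a dict of the same shape as self.found_fields; the function name dependency_order is hypothetical, and ignoring self-referencing foreign keys is my own assumption.

```python
from collections import deque


def dependency_order(parents_by_model):
    """Return model names so that every parent comes before its children.

    Raises ValueError on circular foreign keys or unknown parents.
    """
    pending = {}  # model -> number of parents not yet placed
    children = {name: [] for name in parents_by_model}
    for name, parents in parents_by_model.items():
        # Assumption: a foreign key to the model itself does not block ordering.
        distinct = set(parents) - {name}
        missing = distinct - set(parents_by_model)
        if missing:
            raise ValueError("unknown parents for %s: %s" % (name, sorted(missing)))
        pending[name] = len(distinct)
        for parent in distinct:
            children[parent].append(name)

    # Start with the models that have no parents at all.
    ready = deque(sorted(name for name, count in pending.items() if count == 0))
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for child in sorted(children[name]):
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(parents_by_model):
        stuck = sorted(set(parents_by_model) - set(order))
        raise ValueError("circular foreign keys among: %s" % stuck)
    return order


found_fields = {"D": ["C", "B"], "B": ["A"], "C": ["B"], "A": []}
print(dependency_order(found_fields))  # ['A', 'B', 'C', 'D']
```

The sheets could then be processed in the returned order, whatever order they arrive in.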