2

I have a student model that already has too many fields including the name, nationality, address, language, travel history, etc of the student. It is as below:

class Student(Model):
    user = OneToOneField(CustomUser, on_delete=CASCADE)
    #  Too many other fields

A student has much more information I store in other tables with a OneToOne relationship with the student model such as:

class StudentIelts(Model):

    student = OneToOneField(Student, on_delete=CASCADE)
    has_ielts = BooleanField(default=False,)
    # 8 other fields for IELTS including the scores and the date
    # and file field for uploading the IELTS result

# I have other models for Toefl, GMAT, GRE, etc that 
# are related to the student model in the same manner through 
# a OneToOne relationship such as:

class StudentIBT(Model):

    student = OneToOneField(Student, on_delete=CASCADE)
    has_ibt = BooleanField(default=False,)
    # other fields

Should I merge the tables into one table or the current database schema is good?

The reason I chose this schema is because I was not comfortable working with a table with too many columns. The point is that for every student, there should be a table for IELTS and other models and, as a result, the number of rows in Student table is the same as the number of rows in the IELTS table, as an example.

Amin Ba
  • 1,603
  • 1
  • 13
  • 38
  • 1
    The only thing I'd add to the great answer below is that your "exam" models might be sharing some functionality. Consider using an [abstract](https://docs.djangoproject.com/en/stable/topics/db/models/#abstract-base-classes) `StudentExam` model so that you can encapsulate common behaviour in the parent class (e.g. the `student` field and maybe a `clean()` method or a `score()` method) – dirkgroten Oct 01 '19 at 15:10
  • Ok. I am thinking of creating all StudentExam models the moment the Student model is created, the signup moment. I am more comfortable with this. Do you recommend doing it in the save method of my signup form or save method of my Student model. I am more comfortable with the second one because a Student model may be created from routes other than signupn. Do you recommend adding a save method to Student model to do this? – Amin Ba Oct 01 '19 at 15:19
  • Can’t answer that. Both are fine. There’s also the post_save signal that you could use. It’s a matter of preference and business requirements. – dirkgroten Oct 01 '19 at 15:22
  • With the post_save signal, can I make it atomic? I mean I am worried that I create a Student model but StudentExams model don't get created and my application crashes, because, in the view, I will rely on the fact that the models will be built – Amin Ba Oct 01 '19 at 15:26
  • 1
    First if it fails at creation it’ll report an error anyway and this would be an edge case. Second never rely on this in your view. When fetching the reverse relationship always handle exceptions. – dirkgroten Oct 01 '19 at 15:29

1 Answers1

4

This is a hard question to answer, with a lot of different opinions, but I would say you are correct in splitting up your relationship into two separate models.

However, there are several considerations to take into account.

When looking at this from a database design perspective, there is hardly any reason to split up your database tables. Whenever there is a one-to-one relationship that is always there, you should merge it into one table. The amount of columns hardly matters, unless you are optimising your database.

An answer from this question sums up the actual physical reasons to split up a 1-to-1 relationship quite nicely:

  • You might want to cluster or partition the two "endpoint" tables of a 1:1 relationship differently.
  • If your DBMS allows it, you might want to put them on different physical disks (e.g. more performance-critical on an SSD and the other on a cheap HDD).
  • You have measured the effect on caching and you want to make sure the "hot" columns are kept in cache, without "cold" columns "polluting" it.
  • You need a concurrency behavior (such as locking) that is "narrower" than the whole row. This is highly DBMS-specific.
  • You need different security on different columns, but your DBMS does not support column-level permissions.
  • Triggers are typically table-specific. While you can theoretically have just one table and have the trigger ignore the "wrong half" of the row, some databases may impose additional limits on what a trigger can and cannot do. For example, Oracle doesn't let you modify the so called "mutating" table from a row-level trigger - by having separate tables, only one of them may be mutating so you can still modify the other from your trigger (but there are other ways to work-around that).

Databases are very good at manipulating the data, so I wouldn't split the table just for the update performance, unless you have performed the actual benchmarks on representative amounts of data and concluded the performance difference is actually there and significant enough (e.g. to offset the increased need for JOINing).

Django's standpoint

If you look at the way Django is designed, there is some merit to splitting up your table into one-to-one relationships. One of Django's design philosophies is 'Loose coupling'. Which in Django's ecosystem means that separate applications shouldn't have to know about each other to function properly. In you case, it could be argued that a Student model shouldn't have to know anything about it's IELTS tests, because if you separate those two, the Student model could be reused in some other application. Also, some functionality that does some kind of analysis over IELTS tests, shouldn't have to 'know' anything about the student that took this test.

Do use this design pattern with some caution though. A good question to ask yourself would be not necessarily "How may columns do I have in my model?", because sometimes there is a good reason to have a lot of data in one model. So answering yes to this question alone would not necessarily merit splitting up your tables. A better question to ask yourself would be "Do I want to separate responsibilities/functionality of these two types of data?", which could be for any reason, like reusability or security.

Nico Griffioen
  • 5,143
  • 2
  • 27
  • 36
  • Separate responsibility of StudentIelts model is that I use this model to create a modelform ```IeltsForm(ModelForm)``` ```Model=StudentIelts``` . However, this is achievable, if the two models are merged, by ```Model=Student``` and specifying the fields. However, there is a problem with dividing. The first time a student visits the view in which the student fills his/her ielts, I need to do this: ``` student_ielts, student_ielts_created = StudentIelts.objects.get_or_create(student=student)``` So, what is your suggestion considering these additional issues I explained? – Amin Ba Oct 01 '19 at 08:40
  • I would say keep them separated, and use two different forms for Student and Ielts, and only allow logged in users to submit an IeltsForm, filling in the IELTS_student from the currently logged in user in the form.save() method. – Nico Griffioen Oct 01 '19 at 08:45
  • One other tip: I think the `Ielts` class shouldn't be called `StudentIelts`, but maybe just `Ielts`, or something like `IeltsResult`, and the field linking it to a student should just be called `student`. – Nico Griffioen Oct 01 '19 at 08:46
  • for the other tip, I agree with you. However, I have another model, Program. Each program has a required ielts score. The model should be Ielts to differentiate them. I even has named the fields as ```student_ielts_reading``` to distinguish them in view – Amin Ba Oct 01 '19 at 08:49
  • Every student "might" have a teacher. The student may enroll himself/herself or might be enrolled by his/her teacher. I do not know how to implement this. Should I define a different view to insert and update the Student model? And should I still keep these separated? – Amin Ba Oct 01 '19 at 08:51
  • That depends on a lot of things, but a simple way to do this is to have a StudentForm, and check there if a user that is also a teacher is logged in when the form is submitted. From there you can set the teacher field on the student in the form.save() method. – Nico Griffioen Oct 01 '19 at 08:56
  • One last question. As I said I do this: ``` student_ielts, student_ielts_created = StudentIelts.objects.get_or_create(student=student)``` Instead, I can create the studentIelts, StudentIbt, StudenGRE, etc record in their corresponding table the moment the student signup in the save method of the student sign up form. which one is better? – Amin Ba Oct 01 '19 at 09:00
  • I think the first option is better, since you don't want the Student model to be responsible for those relationships. – Nico Griffioen Oct 01 '19 at 09:04
  • I asked this question about StudentIelts model: https://stackoverflow.com/questions/58181871/how-to-do-customize-model-save-method-instead-of-form-clean-method-and-is-it-adv – Amin Ba Oct 01 '19 at 09:34