39

As I understand it, when one creates a Django application, data is validated by the form before it's inserted into a model instance which is then written to the database. But if I want to create an additional layer of protection at the data model layer, is what I've done below the current "best practice?" I'm trying to ensure that a reviewer's name cannot be omitted nor be left blank. Should I be putting any custom validation in the 'clean' method as I've done here and then have 'save' call 'full_clean" which calls 'clean'? If not, what's the preferred method? Thanks.

class Reviewer(models.Model):
    name = models.CharField(max_length=128, default=None)

    def clean(self, *args, **kwargs):
        if self.name == '':
            raise ValidationError('Reviewer name cannot be blank')
        super(Reviewer, self).clean(*args, **kwargs)

    def full_clean(self, *args, **kwargs):
        return self.clean(*args, **kwargs)

    def save(self, *args, **kwargs):
        self.full_clean()
        super(Reviewer, self).save(*args, **kwargs)
Serjik
  • 10,543
  • 8
  • 61
  • 70
Jim
  • 13,430
  • 26
  • 104
  • 155

4 Answers4

45

Firstly, you shouldn't override full_clean as you have done. From the django docs on full_clean:

Model.full_clean(exclude=None)
This method calls Model.clean_fields(), Model.clean(), and Model.validate_unique(), in that order and raises a ValidationError that has a message_dict attribute containing errors from all three stages.

So the full_clean method already calls clean, but by overriding it, you've prevented it calling the other two methods.

Secondly, calling full_clean in the save method is a trade off. Note that full_clean is already called when model forms are validated, e.g. in the Django admin. So if you call full_clean in the save method, then the method will run twice.

It's not usually expected for the save method to raise a validation error, somebody might call save and not catch the resulting error. However, I like that you call full_clean rather than doing the check in the save method itself - this approach allows model forms to catch the problem first.

Finally, your clean method would work, but you can actually handle your example case in the model field itself. Define your CharField as

name = models.CharField(max_length=128)

The blank option will default to False. If the field is blank, a ValidationError will be raised when you run full_clean. Putting default=None in your CharField doesn't do any harm, but it is a bit confusing when you don't actually allow None as a value.

Alasdair
  • 298,606
  • 55
  • 578
  • 516
  • Thanks Aladair. You're correct about full_clean already calling clean so I've taken the latter out. To your second point, I'm doing this additional validation in the event that I try to change the database from the Python shell rather than the Django admin I/F. I'm thinking this additional validation is worth the slight overhead. As to your final point, if you remove 'default=None", you can then omit the name value entirely and no exception is raised. I don't want this. Leaving that in and calling clean prevents me from both entering a blank value or omitting the name field altogether. – Jim Oct 18 '12 at 03:07
  • Sorry I misspelled "Alasdair." – Jim Oct 18 '12 at 03:28
  • 1
    I disagree about `default=None`. I just tested `r=Reviewer(); r.full_clean()` and it raises a `ValidationError {'name': [u'This field cannot be blank.']}`. Anyway, it's unrelated to your main question, so not too important. – Alasdair Oct 18 '12 at 09:58
  • My main concern about running validation twice would be if the `clean` method is expensive e.g. it involves lots of db queries. In this case, I agree it's not much of an overhead. Note that even if you implement this, there are still ways to get invalid data into the db. For example `Review.objects.update(name="")`. – Alasdair Oct 18 '12 at 10:02
  • I'm going to experiment with this more today. I struggled with this problem last weekend before knowing about the methods you've mentioned (http://stackoverflow.com/questions/12879256/why-does-django-orm-allow-me-to-omit-parameters-for-not-null-fields-when-creatin). I thought Django did no model-only validation whatsoever. You seem to say I can force it to with full_clean. I want to do model validation plus constraint-checking as well. More later... And thanks! – Jim Oct 18 '12 at 15:33
  • Here's a solution - create a save_if_valid method on your models. Call that from the shell (or elsewhere in your code). Modelforms will continue to use the normal save method so your validators only get called once. – Andy Baker Dec 16 '13 at 15:21
  • To ensure that the clean() method is called only once, can I set a flag variable in my model (ex: _is_cleanded=False) and check if the clean() method is already called (by the ModelForm) or not?? – Fabrizio A Oct 09 '14 at 13:18
7

After thinking about Alasdair's answer and doing addtional reading, my sense now is that Django's models weren't designed so as to be validated on a model-only basis as I'm attempting to do. Such validation can be done, but at a cost, and it entails using validation methods in ways they weren't intended for.

Instead, I now believe that any constraints other than those that can be entered directly into the model field declarations (e.g. "unique=True") are supposed to be performed as a part of Form or ModelForm validation. If one wants to guard against entering invalid data into a project's database via any other means (e.g. via the ORM while working within the Python interpreter), then the validation should take place within the database itself. Thus, validation could be implemented on three levels: 1) First, implement all constraints and triggers via DDL in the database; 2) Implement any constraints available to your model fields (e.g. "unique=True"); and 3) Implement all other constraints and validations that mirror your database-level constraints and triggers within your Forms and ModelForms. With this approach, any form validation errors can be re-displayed to the user. And if the programmer is interacting directly with the database via the ORM, he/she would see the database exceptions directly.

Thoughts anyone?

Jim
  • 13,430
  • 26
  • 104
  • 155
  • For example, if, in my model, I define: "name = models.CharField(max_length=128, default=None)", I am prevented from doing "Movie.objects.create()" because the model will send the database a null value which causes it to raise an integrity error. If I also add the check constraint "CHECK (name <> '')" to my Postgres database, it will raise an integrity error if I try to do "Movie.objects.create(name='')". – Jim Oct 19 '12 at 00:45
  • 14
    I find extra logic in the database icky. Where do you add this logic? Where do you test it? How do you keep it in sync across installs? Testing? Version control? If you are going to do this then you should find a way to integrate it into your model declarations so it is visible in the Python code along with everything else. A custom field type? A model mixin? – Andy Baker Dec 16 '13 at 15:19
  • 1
    Django models only touch the database upon save. Therefore it makes sense to do the validation in the model class, not in the database, because this way you can detect and correct invalid values before touching the database. – Quant Metropolis Jul 23 '14 at 11:21
6

Capturing the pre-save signals on on my models ensured clean will be called automatically.

from django.db.models.signals import pre_save

def validate_model(sender, **kwargs):
    if 'raw' in kwargs and not kwargs['raw']:
        kwargs['instance'].full_clean()

pre_save.connect(validate_model, dispatch_uid='validate_models')
Kevin Parker
  • 16,975
  • 20
  • 76
  • 105
0

Thanks @Kevin Parker for your answer, quite helpful!

It is common to have models in your app outside of the ones you define, so here is a modified version you can use to scope this behavior only to your own models or a specific app/module as desired.

from django.db.models.signals import pre_save
import inspect
import sys

MODELS = [obj for name, obj in
    inspect.getmembers(sys.modules[__name__], inspect.isclass)]

def validate_model(sender, instance, **kwargs):
    if 'raw' in kwargs and not kwargs['raw']:
        if type(instance) in MODELS:
            instance.full_clean()

pre_save.connect(validate_model, dispatch_uid='validate_models')

This code will run against any models defined inside the module where it is executed but you can adapt this to scope more strictly or to be a set of modules / apps if desired.

Matt Sanders
  • 8,023
  • 3
  • 37
  • 49