366

In my model I have:

class Alias(MyBaseModel):
    remote_image = models.URLField(
        max_length=500, null=True,
        help_text='''
            A URL that is downloaded and cached for the image.
            Only used when the alias is made
        '''
    )
    image = models.ImageField(
        upload_to='alias', default='alias-default.png',
        help_text="An image representing the alias"
    )

    
    def save(self, *args, **kw):
        if (not self.image or self.image.name == 'alias-default.png') and self.remote_image :
            try :
                data = utils.fetch(self.remote_image)
                image = StringIO.StringIO(data)
                image = Image.open(image)
                buf = StringIO.StringIO()
                image.save(buf, format='PNG')
                self.image.save(
                    hashlib.md5(self.string_id).hexdigest() + ".png", ContentFile(buf.getvalue())
                )
            except IOError :
                pass

This works great the first time the remote_image changes.

How can I fetch a new image when someone has modified the remote_image on the alias? And secondly, is there a better way to cache a remote image?

hashlash
Paul Tarjan

28 Answers

531

Essentially, you want to override the __init__ method of models.Model so that you keep a copy of the original value. This makes it so that you don't have to do another DB lookup (which is always a good thing).

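    from django.db import models


    class Person(models.Model):
        name = models.CharField(max_length=100)

        __original_name = None

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # Remember the value the instance was created or loaded with.
            self.__original_name = self.name

        def save(self, force_insert=False, force_update=False, *args, **kwargs):
            if self.name != self.__original_name:
                # name changed - do something here
                pass

            super().save(force_insert=force_insert, force_update=force_update, *args, **kwargs)
            # Reset the snapshot so a later save() compares against the saved value.
            self.__original_name = self.name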
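
The double-underscore attribute used above is subject to Python's name mangling, which is why it cannot be reached under its plain name from outside the class. The following is a minimal, framework-free sketch of that behaviour; the `Person` class here is a plain Python stand-in (an assumption for illustration), not a Django model.

```python
class Person:
    def __init__(self, name):
        self.name = name
        # Inside the class body this is stored as _Person__original_name.
        self.__original_name = name

    def name_changed(self):
        # Inside the class the mangled name is resolved automatically.
        return self.name != self.__original_name


p = Person("Ada")
p.name = "Grace"

print(p.name_changed())               # True
print(hasattr(p, "__original_name"))  # False: the plain name does not exist
print(p._Person__original_name)       # Ada: the mangled name does
```

Because the mangled name includes the class name, a subclass that refers to `self.__original_name` in its own methods would look up a different attribute. A single leading underscore avoids that at the cost of weaker privacy.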
Cesar Canassa
Josh
  • 32
    instead of overwriting init, I'd use the post_init-signal http://docs.djangoproject.com/en/dev/ref/signals/#post-init – vikingosegundo Nov 24 '09 at 22:43
  • 40
    Overriding methods is recommended by the Django documentation: http://docs.djangoproject.com/en/dev/topics/db/models/#overriding-predefined-model-methods – Colonel Sponsz Aug 30 '10 at 19:55
  • 2
    Changing the instance passed by the post_init signal doesn't work. You have to override the init method. – Cesar Canassa Nov 18 '10 at 18:40
  • 8
    One thing to note (that seems fairly obvious now but just bit me for a half hour) is that if you do this, you'll never be able to defer() 'name' again, as it will be constantly checking for self.name on init. One sloppy workaround is to check if it exists in `self.__dict__` and skip setting `__original_name` if not. There's probably a better way, just haven't found it quite yet. – umbrae Jul 11 '11 at 18:49
  • 2
    I don't understand why it's necessary to have `self.__original_name = self.name` at the end of the `save()` method. Can anyone explain? – callum Jan 31 '12 at 13:18
  • 14
    @callum so that if you make changes to the object, save it, then make additional changes and call `save()` on it AGAIN, it will still work correctly. – philfreo Mar 14 '12 at 22:57
  • 1
    @Josh - thanks for this answer! One misc style question. Why do you use two underscores instead of just one for `__original_name`? – Ghopper21 Aug 07 '12 at 00:53
  • 1
    @Ghopper21 quite welcome. If you're asking about the difference between two underscores and one, check this out: http://stackoverflow.com/questions/1301346/the-meaning-of-a-single-and-a-double-underscore-before-an-object-name-in-python. If you're asking why I chose 2 underscores rather than 1, it's because I wanted that variable to be private - can't have other code messing with its value - and 2 underscores is as private as python gets. – Josh Aug 07 '12 at 01:04
  • @umbrae -- If you skip setting __original_name then that defeats this technique doesn't it? Scenario: name has a value in the database, but is deferred so __original_name is not set. a) The user submits a new value for name b) The user simply reads the existing value. You then check __original_name != name, which is True in both cases, and you can't tell whether a new value has been submitted. – Panayiotis Karabassis Aug 22 '12 at 09:00
  • Furthermore: c) User deletes name, and __original_name == name, even though a change has taken place. – Panayiotis Karabassis Aug 22 '12 at 09:16
  • 26
    @Josh won't there be a problem if you have several application servers working against the same database as it only tracks changes in memory – Jens Alm Sep 02 '12 at 07:36
  • 2
    @JensAlm I think that depends on your specific application. However, I think that, in most cases, this will work fine. The question we're asking is "did the user do something that changed the value". In most cases, this catches the true positive or negative. I suppose there would be some confusion if the user was working in multiple tabs at the same time. However, if I were that user, I think my expectations of the behavior would change. – Josh Sep 03 '12 at 14:35
  • 5
    Overriding __init__ of a model is not recommended by django doc https://docs.djangoproject.com/en/dev/ref/models/instances/?from=olddocs#django.db.models.Model – lajarre Nov 23 '12 at 18:08
  • 16
    @lajarre, I think your comment is a bit misleading. The docs suggest that you take care when you do so. They don't recommend against it. – Josh Nov 23 '12 at 21:18
  • 1
    I added something similar to my model and get " object has no attribute '__original_name'". Does it work on existing data? – jul Dec 06 '12 at 17:21
  • @Jul, you have to add the private variable __original_name and, assuming you named it that way, you're not going to be able to access it from outside the model instance. In any case, yes, it will work on existing data. – Josh Dec 07 '12 at 14:17
  • 1
    Coming in really late on this, but one thing I noticed when using this approach with a FK is that there are a lot of extra DB hits and select_related does not seem to fix it. I believe one of the pre-save options is best. – Esteban Oct 11 '13 at 19:21
  • I'm sure the extra db lookups can be avoided. Keep in mind that if you have an object car with an FK to person via the attribute owner, every time you call `car.owner`, you'll perform a db lookup. However, if you do `owner = car.owner` and use `owner` in the future, you'll avoid the extra lookups. I think this behavior has changed in the newest version of Django though. In any case, as I mentioned previously, I'm sure you can avoid the extra lookups with proper handling. – Josh Oct 11 '13 at 19:31
  • 1
    I think if you want to avoid doing this with new instances, you can throw in `if self.pk:` before the property is set in `__init__`. – John Chadwick Nov 27 '13 at 00:40
  • Sometimes it is really hard to debug `signal`s in big projects, but the usage is justified. Instead of searching all parts of the code to change `save` logic you could use signal. I do not know how big the project is and which method exactly is the best. But, overriding the `__init__` method seems to be fine if it is not overwhelmed. Also note that `__init__` itself uses `pre_init` and `post_init` signals. Also do not forget that all `post.pk` will add a number of requests to database. – boldnik Jan 06 '14 at 21:20
  • 5
    This won't work if you initialize a model in the constructor. The values clearly have changed but they're already set by the time Model.__init__() returns, so they will not have appeared to change later when they are checked. – Nathan Osman Jul 16 '14 at 02:56
  • 4
    @JensAlm is right. This code will deceptively work fine on your single-process testing server, but the moment you deploy it to any sort of multi-processing server, it will give completely wrong results. – rspeer Jan 09 '15 at 22:06
  • 6
    You do not need multiple processes for this to fail. Just create two different model instances pointing to the same DB row. Change one and save it. Then save the second one (without changing it). The second save will result in a change to the DB (back to the original value), but the method described in this answer will not detect it. – Frank Pape Aug 26 '16 at 14:21
  • 1
    This solution is far from great and i would advise you don't use it when you need to load related objects or defer loading of some properties. The way django works with `select_related` is it: 1st creates an instance (executes models `__init__`, than follows the `post_init` signal) and only after that the properties that are related are populated. So in such cases usage of `select_related` is meaningless and you will be loading every related property with a separate query (because it will get fetched before related properties are set). This can be a big performance hit to your app. – greginvm Oct 12 '16 at 14:05
  • 2
    Please note that it *doesn't* work with deferred fields (using .only() on queryset). If one would use Person.objects.only('id').first(), Django 1.8 would get into infinite recursion. – marxin Nov 16 '16 at 14:19
  • @ColonelSponsz Yes, overriding methods is not discouraged, but there is a [big note](https://docs.djangoproject.com/en/1.10/ref/models/instances/#creating-objects) on overriding `__init__` – gdvalderrama Jan 23 '17 at 19:18
  • IMHO, model is not a good place to do such work, especially things that was asked in this question. Controller will be much better place. – Vladimir Prudnikov Apr 26 '19 at 11:44
  • 2
    It works good, but you have to handle the adding new object `if not self.id or self.name != self.__original_name:` – Ibrahim Tayseer Feb 16 '20 at 19:08
  • Style question: why `def save(self, force_insert=False, force_update=False, *args, **kwargs)` and not just `def save(self, *args, **kwargs)` ? Then later `super().save(*args, **kwargs)` instead of `super().save(force_insert, force_update, *args, **kwargs)`, I mean. Thanks – Mario Orlandi Dec 10 '21 at 17:27
  • 1
    @MarioOrlandi As posted, there's not really a good reason for it. This is an excerpt of code I was using in a project. In the project, I was making use of those specific arguments, and it was just a quick way for me to get at them. You can also just extract them from kwargs, of course. – Josh Dec 20 '21 at 23:21
  • 3
    Be careful when using this method. I implemented this successfully for a while on an enterprise application. Ended up causing a 'python recursion limit' error when being deleted as a referenced object later in the project. Looked at and tried many different strategies and finally had to change this to making a quick read query (with race condition safety check) within the save method instead. I also need to mention that this didn't show up until I upgraded to Django 3.2.12 from Django 2.x.x. and from Python 3.7 to Python 3.9 – Rob Barber Mar 25 '22 at 16:45
  • @RobBarber I also (sadly) faced a 'python recursion limit' error with Django 4.0.4, solved following your suggestion. Could you kindly elaborate on "with race condition safety check"? Many thanks – Mario Orlandi May 23 '22 at 10:52
  • 2
    @MarioOrlandi The race condition safety check is just a quick query to the database to make sure you still have the most "up to date" model instance. There is no guarantee that by the time you get to the save method that the model's data is valid. So I just do a quick check on when the model was last updated and compare it to what I expect. For small sites this may not even be an issue, I just like to cover my bases. If it will 100% be an issue then database transactions might work better. – Rob Barber May 23 '22 at 13:24
  • Thank you @RobBarber .. indeed in my case it's not an issue, but I see your point. – Mario Orlandi May 23 '22 at 13:30
  • 1
    I have proposed edit to this answer with code that prevents 'python recursion limit' without making any new database query. May be interesting for @RobBarber . In short, consider checking if the desired field in `self.get_deferred_fields()` – David Jul 01 '22 at 17:22
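
The limitation raised several times in the comments above (two instances pointing at the same row) can be reproduced without Django. This is only a sketch under stated assumptions: `FAKE_DB` is a dict standing in for a database table and `TrackedPerson` is a hypothetical class that snapshots `name` when it is loaded, mirroring the technique in the answer.

```python
# A dict standing in for a database table: primary key -> row.
FAKE_DB = {1: {"name": "Ada"}}


class TrackedPerson:
    """Hypothetical stand-in for a model that snapshots `name` on load."""

    def __init__(self, pk):
        self.pk = pk
        self.name = FAKE_DB[pk]["name"]
        self._original_name = self.name  # in-memory snapshot

    def save(self):
        # Compare against the snapshot, not against the stored row.
        changed = self.name != self._original_name
        FAKE_DB[self.pk] = {"name": self.name}
        self._original_name = self.name
        return changed  # what the snapshot technique reports


first = TrackedPerson(1)
second = TrackedPerson(1)  # a second instance of the same row

first.name = "Grace"
first_reported = first.save()    # True: the change is detected
second_reported = second.save()  # False, yet the row is overwritten

print(first_reported, second_reported, FAKE_DB[1]["name"])
```

The second save silently restores the old value while reporting no change, which is the failure mode described by Frank Pape. Whether that matters depends on how instances are shared in a given code base.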
268

I use the following mixin:

from django.forms.models import model_to_dict


class ModelDiffMixin(object):
    """
    A model mixin that tracks model fields' values and provide some useful api
    to know what fields have been changed.
    """

    def __init__(self, *args, **kwargs):
        super(ModelDiffMixin, self).__init__(*args, **kwargs)
        self.__initial = self._dict

    @property
    def diff(self):
        d1 = self.__initial
        d2 = self._dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if v != d2[k]]
        return dict(diffs)

    @property
    def has_changed(self):
        return bool(self.diff)

    @property
    def changed_fields(self):
        return list(self.diff.keys())

    def get_field_diff(self, field_name):
        """
        Returns a diff for field if it's changed and None otherwise.
        """
        return self.diff.get(field_name, None)

    def save(self, *args, **kwargs):
        """
        Saves model and set initial state.
        """
        super(ModelDiffMixin, self).save(*args, **kwargs)
        self.__initial = self._dict

    @property
    def _dict(self):
        return model_to_dict(self, fields=[field.name for field in
                             self._meta.fields])

Usage:

>>> p = Place()
>>> p.has_changed
False
>>> p.changed_fields
[]
>>> p.rank = 42
>>> p.has_changed
True
>>> p.changed_fields
['rank']
>>> p.diff
{'rank': (0, 42)}
>>> p.categories = [1, 3, 5]
>>> p.diff
{'categories': (None, [1, 3, 5]), 'rank': (0, 42)}
>>> p.get_field_diff('categories')
(None, [1, 3, 5])
>>> p.get_field_diff('rank')
(0, 42)
>>>

Note

Please note that this solution works well only within the context of the current request, so it is suitable primarily for simple cases. In a concurrent environment, where multiple requests can manipulate the same model instance at the same time, you definitely need a different approach.
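
The `diff` property of the mixin boils down to comparing two dictionaries of field values. A standalone sketch of that comparison is shown here; `diff_states` is a hypothetical helper name, and plain dicts stand in for the output of `model_to_dict`.

```python
def diff_states(initial, current):
    """Return {field: (old, new)} for every field whose value differs."""
    return {
        key: (old, current[key])
        for key, old in initial.items()
        if old != current[key]
    }


initial = {"rank": 0, "categories": None}
current = {"rank": 42, "categories": [1, 3, 5]}

changes = diff_states(initial, current)
print(changes)  # {'rank': (0, 42), 'categories': (None, [1, 3, 5])}

# Values of different types count as a change even when they "look" equal.
print(diff_states({"field_name": 0}, {"field_name": "0"}))  # {'field_name': (0, '0')}
```

The last line shows why uncleaned data, such as a string assigned to an integer field, is reported as a difference: the comparison is a plain `!=` on the raw values.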

iperelivskiy
  • 4
    Really perfect, and do not perform extra query. Thanks a lot ! – Stéphane Mar 04 '13 at 10:26
  • Any advice on how to ignore a type change? Its considering this a difference: {'field_name': (0L, u'0')} – IMFletcher Sep 03 '13 at 15:45
  • @IMFletcher In your case you deal with uncleaned data assigned to a model field. This sort of thing is out of scope of this mixin. You may try first clean data with a model form that would populate your model fields for free on saving. Or manually, i.e. model_instance.field_name = model_form.cleaned_data['field_name'] – iperelivskiy Sep 04 '13 at 19:33
  • @livskiy Agree. I ended up using TypedChoiceField instead to ensure the ids were converted to ints from strings before doing the diff. Thanks for the fantastic mixin! – IMFletcher Sep 05 '13 at 03:07
  • model_to_dict does not return fields where the editable=False is set and changes to these fields could be important (they might be invoice totals for example). For more general utility, replace the call to model_to_dict with {field.name: field.value_from_object(self) for field in self._meta.fields} – Paul Whipp Feb 07 '14 at 07:24
  • 9
    Mixin is great, but this version has problems when used together with .only(). The call to Model.objects.only('id') will lead to infinite recursion if Model has at least 3 fields. To solve this, we should remove deferred fields from saving in initial and change _dict property [a bit](https://gist.github.com/pitsevich/d8cf357df3b927cf13a2) – gleb.pitsevich Jun 06 '14 at 14:00
  • I had the recursion problem, using a .raw() call. pitsevich's fix worked for me. Thanks. – Lee Hinde Sep 24 '14 at 21:12
  • 28
    Much like Josh's answer, this code will deceptively work fine on your single-process testing server, but the moment you deploy it to any sort of multi-processing server, it will give incorrect results. You can't know if you're changing the value in the database without querying the database. – rspeer Jan 09 '15 at 22:10
  • This does not work for changes to many-to-many relationships (at least in django 1.10). The reason is `_dict` is using `.fields`. That's also an internal django method, and so should not be used anyways. `get_fields()` is better. – theicfire Feb 06 '17 at 21:51
  • As mentioned by @Frank Pape above: You do not need multiple processes for this to fail. Just create two different model instances pointing to the same DB row. Change one and save it. Then the second one's diff will be wrong. – Arnaud P Apr 26 '17 at 16:57
  • @ArnaudP You're right. Again this solution is for simple cases as noted in the answer. It's not intended to cover 100% of all possible cases you can imagine. Btw I'm interested why do you want two model instances of the same row in the one request/response cycle or task or you-name-it flow? – iperelivskiy Apr 26 '17 at 17:34
  • @ivanperelivskiy I don't want two model instances of the same row. But it doesn't mean it's not going to happen someday when half a dozen more people have been tinkering with the growing code base :) After reading most of the comments on this page, I'm going for the extra db lookup, because my usage is not performance critical ATM, and I'd rather guard against entropy first. Just a choice – Arnaud P Apr 27 '17 at 07:42
  • Where do you place the mixin? in views.py? I am looking to use this for another model of mine where I have modified the update call in views.py – opunsoars May 14 '18 at 05:57
  • Perfect explanation – miltonbhowmick Jul 18 '20 at 10:15
  • This is a great solution, I'd love to know how to include watching for changes in a m2m relationship. – William Colmenares Oct 15 '20 at 18:30
  • @WilliamColmenares you can include m2m as well but you have to change `model_to_dict`: https://stackoverflow.com/a/29088221/5081021 (step 5) – Jurrian May 06 '21 at 13:28
  • 2
    On refresh_from_db we also should reinitialize initial state. `def refresh_from_db(self, using=None, fields=None): super().refresh_from_db(using, fields) self.__initial = self._dict` – Lev Lybin Oct 08 '21 at 16:35
  • @gleb.pitsevich's link (avoiding recursion error) is now dead. updated link: https://gist.github.com/gpg90/d8cf357df3b927cf13a2. – Danilo Gómez Oct 31 '22 at 12:58
209

The best way is with a pre_save signal. It may not have been an option back in '09 when this question was asked and answered, but anyone seeing this today should do it this way:

from django.db.models.signals import pre_save
from django.dispatch import receiver


@receiver(pre_save, sender=MyModel)
def do_something_if_changed(sender, instance, **kwargs):
    try:
        # Fetch the row as it is currently stored.
        obj = sender.objects.get(pk=instance.pk)
    except sender.DoesNotExist:
        pass  # Object is new, so field hasn't technically changed, but you may want to do something else here.
    else:
        if obj.some_field != instance.some_field:  # Field has changed
            # do something
            pass
radtek
Chris Pratt
  • 8
    Why is this the best way if the method that Josh describes above doesn't involve an extra database hit? – joshcartme Oct 31 '11 at 21:54
  • 54
    1) that method is a hack, signals are basically designed for uses like this 2) that method requires making alterations to your model, this one does not 3) as you can read in the comments on that answer, it has side-effects that can be potentially problematic, this solution does not – Chris Pratt Oct 31 '11 at 22:10
  • 2
    This way is great if you only care about catching the change just prior to saving. However, this won't work if you want to react to the change immediately. I have come across the latter scenario many times (and I'm working on one such instance now). – Josh May 31 '12 at 21:17
  • 7
    @Josh: What do you mean by "react to the change immediately"? In what way does this not let you "react"? – Chris Pratt May 31 '12 at 21:48
  • 3
    Sorry, I forgot the scope of this question and was referring to an entirely different problem. That said, I think signals are a good way to go here (now that they're available). However, I find many people consider overriding save a "hack." I don't believe this is the case. As this answer suggests (http://stackoverflow.com/questions/170337/django-signals-vs-overriding-save-method), I think overriding is the best practice when you're not working on changes that are "specific to the model in question." That said, I don't intend to impose that belief on anyone. – Josh May 31 '12 at 22:21
  • 2
    @Josh: don't you mean changes that *are* specific to the model in question? Based on the answer you linked to, signals are best suited when you want to reuse the same code across models. Did I miss something? – Panayiotis Karabassis Aug 22 '12 at 09:20
  • I would prefer just overriding save method and doing basically the same thing there. I don't think it has any disadvantages when compared to signals. – clime Nov 29 '13 at 14:06
  • If you can get away with doing this at the form level you can leverage the forms cached instance object and compare it to the cleaned_data. See zgoda's answer. Also the solution using django-model-changes looks clean. – radtek Mar 11 '15 at 15:42
  • @radtek looks like zgoda's answer may cause a race condition for concurrent requests to edit the same model instance – Anupam May 30 '17 at 11:30
  • @Anupam This answer may also cause a race conditions for concurrent requests - it's just only possible in the time delta b/w the pre_save signal and the save. You might say that window is shorter, but by loading more actions in the pre_save signals, you increase the time for all other saves, increasing the likelihood of conflicts generally. The only solution to the problem you mention is database locks, which also has hazards and should not be used lightly at all. – AlanSE May 11 '18 at 15:55
  • This method saved me. The hacky method is a serious problem now if you want to implement raw SQL in your code. It causes a maximum recursion error because it tries to initialize the variables over and over in the `__init__` method. – rchurch4 Dec 20 '18 at 23:58
  • I've been doing this with signals for a while, nice to see some confirmation and sanity checking here. – Milo Persic Feb 10 '22 at 19:13
157

And now for a direct answer: one way to check whether the value of a field has changed is to fetch the original data from the database before saving the instance. Consider this example:

class MyModel(models.Model):
    f1 = models.CharField(max_length=1)

    def save(self, *args, **kw):
        if self.pk is not None:
            orig = MyModel.objects.get(pk=self.pk)
            if orig.f1 != self.f1:
                print('f1 changed')
        super(MyModel, self).save(*args, **kw)

The same thing applies when working with a form. You can detect it at the clean or save method of a ModelForm:

class MyModelForm(forms.ModelForm):

    def clean(self):
        cleaned_data = super(MyModelForm, self).clean()
        #if self.has_changed():  # new instance or existing updated (form has data to save)
        if self.instance.pk is not None:  # existing instance only
            if self.instance.f1 != cleaned_data['f1']:
                print('f1 changed')
        return cleaned_data

    class Meta:
        model = MyModel
        exclude = []
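
The read-before-write idea in this answer can be illustrated without Django. This is a sketch under stated assumptions: `FAKE_DB` is a dict standing in for the table and `Record` is a hypothetical class, so the "extra query" is just a dictionary lookup here.

```python
# A dict standing in for a database table: primary key -> row.
FAKE_DB = {1: {"f1": "a"}}


class Record:
    """Hypothetical stand-in for a model with a single field f1."""

    def __init__(self, pk, f1):
        self.pk = pk
        self.f1 = f1

    def save(self):
        stored = FAKE_DB.get(self.pk)  # the extra read before the write
        changed = stored is not None and stored["f1"] != self.f1
        FAKE_DB[self.pk] = {"f1": self.f1}
        return changed


first = Record(1, "a")
second = Record(1, "a")  # a second, soon-to-be-stale instance of the same row

first.f1 = "b"
first_changed = first.save()         # True: "a" -> "b"
second_changed = second.save()       # True: the stale instance reverts "b" -> "a"
new_changed = Record(2, "x").save()  # False: there was no stored row to compare with

print(first_changed, second_changed, new_changed)
```

Comparing against the stored row detects the overwrite by the stale instance, which an in-memory snapshot would miss. The sketch does nothing about the window between the read and the write; in a real database that still needs a transaction or row lock, as the comments point out.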
radtek
zgoda
  • 24
    Josh's solution is much more database friendly. An extra call to verify what's changed is expensive. – dd. Feb 23 '11 at 23:14
  • Would be nice here to consider f1 changed even if the model is being saved for the first time – Josh Bothun Feb 19 '13 at 22:44
  • 8
    One extra read before you do a write isn't that expensive. Also the tracking changes method doesn't work if there are multiple requests. Although this would suffer from a race condition in between fetching and saving. – dalore Feb 24 '16 at 13:01
  • 3
    Stop telling people to check `pk is not None` it doesn't apply for example if using a UUIDField. This is just bad advice. – user3467349 Jun 16 '16 at 18:54
  • 4
    @dalore you can avoid the race condition by decorating the save method with `@transaction.atomic` – Frank Pape Aug 26 '16 at 14:48
  • 3
    @dalore although you'd need to make sure the transaction isolation level is sufficient. In postgresql, default is read committed, but [repeatable read is necessary](https://docs.djangoproject.com/en/1.10/ref/databases/#isolation-level). – Frank Pape Aug 26 '16 at 15:05
67

Since Django 1.8, you can use the from_db classmethod to cache the old value of remote_image. Then, in the save method, you can compare the old and new values of the field to check whether it has changed.

@classmethod
def from_db(cls, db, field_names, values):
    new = super(Alias, cls).from_db(db, field_names, values)
    # Cache the value as it was loaded from the database.
    # (This lookup fails if remote_image is deferred.)
    new._loaded_remote_image = values[field_names.index('remote_image')]
    return new

def save(self, force_insert=False, force_update=False, using=None,
         update_fields=None):
    if (self._state.adding and self.remote_image) or \
        (not self._state.adding and self._loaded_remote_image != self.remote_image):
        # If it is the first save and there is a remote_image,
        # or the value of remote_image has changed - do your stuff!
        pass
    super(Alias, self).save(force_insert=force_insert, force_update=force_update,
                            using=using, update_fields=update_fields)
mcastle
Serge
  • 3
    Thanks -- here's a reference to the docs: https://docs.djangoproject.com/en/1.8/ref/models/instances/#customizing-model-loading. I believe this still results in the aforementioned issue where the database may change between when this is evaluated and when the comparison is done, but this is a nice new option. – trpt4him Oct 22 '15 at 21:36
  • 1
    Rather than searching through values (which is O(n) based on number of values) wouldn't it be faster and clearer to do `new._loaded_remote_image = new.remote_image` ? – dalore Dec 02 '15 at 13:20
  • 1
    Unfortunately I have to reverse my previous (now deleted) comment. While `from_db` is called by `refresh_from_db`, the attributes on the instance (i.e. loaded or previous) are not updated. As a result, I can't find any reason why this is better than `__init__` as you still need to handle 3 cases: `__init__`/`from_db`, `refresh_from_db`, and `save`. – claytond Dec 07 '17 at 16:59
36

Note that field change tracking is available in django-model-utils.

https://django-model-utils.readthedocs.org/en/latest/index.html

Lee Hinde
  • 6
    The [FieldTracker](https://django-model-utils.readthedocs.io/en/latest/utilities.html#field-tracker) from django-model-utils seems to work really well, thank you! – Greg Sadetsky Nov 30 '18 at 03:17
24

If you are using a form, you can use Form's changed_data (docs):

class AliasForm(ModelForm):

    def save(self, commit=True):
        if 'remote_image' in self.changed_data:
            # do things
            remote_image = self.cleaned_data['remote_image']
            do_things(remote_image)
        return super(AliasForm, self).save(commit)

    class Meta:
        model = Alias
laffuste
11

I am a bit late to the party but I found this solution also: Django Dirty Fields

ramwin
Fred Campos
  • 3
    Looking at the tickets, looks like this package is not in an healthy condition right now (looking for maintainers, needing to change their CI by december 31st, etc.) – Overdrivr Dec 17 '20 at 11:02
9

Very late to the game, but this is a version of Chris Pratt's answer that protects against race conditions while sacrificing performance, by using a transaction block and select_for_update()

from django.db import transaction
from django.db.models.signals import pre_save
from django.dispatch import receiver


@receiver(pre_save, sender=MyModel)
@transaction.atomic
def do_something_if_changed(sender, instance, **kwargs):
    try:
        # Lock the row until the enclosing transaction ends.
        obj = sender.objects.select_for_update().get(pk=instance.pk)
    except sender.DoesNotExist:
        pass  # Object is new, so field hasn't technically changed, but you may want to do something else here.
    else:
        if obj.some_field != instance.some_field:  # Field has changed
            # do something
            pass
baqyoteto
  • The last will be the first!! Question.. anyone knows if is possible to get the user here in signals? – Lara Jul 27 '22 at 00:16
  • @Lara probably not in the signals but on my side I'm overwriting `ModelFormMixin.form_valid()` in my class based views to update some fields of the current instance (i.e. `updated_by`). Thus I can get/set the user with `self.request.user`... – scūriolus Apr 06 '23 at 13:21
7

Another late answer, but if you're just trying to see if a new file has been uploaded to a file field, try this: (adapted from Christopher Adams's comment on the link http://zmsmith.com/2010/05/django-check-if-a-field-has-changed/ in zach's comment here)

Updated link: https://web.archive.org/web/20130101010327/http://zmsmith.com:80/2010/05/django-check-if-a-field-has-changed/

def save(self, *args, **kw):
    from django.core.files.uploadedfile import UploadedFile
    if hasattr(self.image, 'file') and isinstance(self.image.file, UploadedFile) :
        # Handle FileFields as special cases, because the uploaded filename could be
        # the same as the filename that's already there even though there may
        # be different file contents.

        # if a file was just uploaded, the file object will be an UploadedFile
        # Do new file stuff here
        pass
    super().save(*args, **kw)
Aaron McMillin
  • That's an awesome solution for checking if a new file was uploaded. Much better than checking the name against the database because the name of the file could be the same. You can use it in a `pre_save` receiver, too. Thanks for sharing this! – DataGreed Dec 02 '19 at 00:23
  • 1
    Here's an example for updating audio duration in a database when the file was updated using mutagen for reading audio info - https://gist.github.com/DataGreed/1ba46ca7387950abba2ff53baf70fec2 – DataGreed Dec 02 '19 at 00:43
6

Every model instance has a __dict__ attribute that holds all of its fields as keys, with the field values as values. So we can simply compare the __dict__ of two instances.

Just change the save function of the model to the function below:

def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
    if self.pk is not None:
        initial = A.objects.get(pk=self.pk)
        initial_json, final_json = initial.__dict__.copy(), self.__dict__.copy()
        initial_json.pop('_state'), final_json.pop('_state')
        only_changed_fields = {k: {'final_value': final_json[k], 'initial_value': initial_json[k]} for k in initial_json if final_json[k] != initial_json[k]}
        print(only_changed_fields)
    super(A, self).save(force_insert=force_insert, force_update=force_update, using=using, update_fields=update_fields)

Example Usage:

class A(models.Model):
    name = models.CharField(max_length=200, null=True, blank=True)
    senior = models.CharField(choices=choices, max_length=3)
    timestamp = models.DateTimeField(null=True, blank=True)

    def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
        if self.pk is not None:
            initial = A.objects.get(pk=self.pk)
            initial_json, final_json = initial.__dict__.copy(), self.__dict__.copy()
            initial_json.pop('_state'), final_json.pop('_state')
            only_changed_fields = {k: {'final_value': final_json[k], 'initial_value': initial_json[k]} for k in initial_json if final_json[k] != initial_json[k]}
            print(only_changed_fields)
        super(A, self).save(force_insert=force_insert, force_update=force_update, using=using, update_fields=update_fields)

yields output with only those fields that have been changed

{'name': {'initial_value': '1234515', 'final_value': 'nim'}, 'senior': {'initial_value': 'no', 'final_value': 'yes'}}
Nimish Bansal
  • 1,719
  • 4
  • 20
  • 37
  • This works like a charm! You can also use that in pre_save signals where, if you need to make additional changes while updating the model itself, you can also make it race-condition safe as shown [here](https://stackoverflow.com/a/60312573). – R. Steigmeier Jun 10 '21 at 08:22
5

As of Django 1.8, there's the from_db method, as Serge mentions. In fact, the Django docs include this specific use case as an example:

https://docs.djangoproject.com/en/dev/ref/models/instances/#customizing-model-loading

The example in that section of the docs shows how to record the initial values of fields as they are loaded from the database.

Amichai Schreiber
  • 1,528
  • 16
  • 16
5

This works for me in Django 1.8. It goes inside a form, since cleaned_data and initial are form attributes:

def clean(self):
    if self.cleaned_data['name'] != self.initial['name']:
        # Do something
        pass
jhrs21
  • 381
  • 1
  • 6
  • 21
4

You can use django-model-changes to do this without an additional database lookup:

from django.db.models.signals import pre_save
from django.dispatch import receiver
from django_model_changes import ChangesMixin

class Alias(ChangesMixin, MyBaseModel):
    # your model
    pass

@receiver(pre_save, sender=Alias)
def do_something_if_changed(sender, instance, **kwargs):
    if 'remote_image' in instance.changes():
        # do something
        pass
Robert Kajic
  • 8,689
  • 4
  • 44
  • 43
3

The optimal solution is probably one that requires neither an additional database read before saving the model instance nor a further Django library. This is why laffuste's solution is preferable. In the context of an admin site, one can simply override the save_model method and inspect the form there, just as in Sion's answer above. You arrive at something like this, drawing on Sion's example setting but using changed_data to get every possible change:

class ModelAdmin(admin.ModelAdmin):
    fields = ['name', 'mode']

    def save_model(self, request, obj, form, change):
        form.changed_data  # output could be ['name']
        # do something with the changed name value...
        # call the super method
        super(ModelAdmin, self).save_model(request, obj, form, change)
  • Override save_model:

https://docs.djangoproject.com/en/1.10/ref/contrib/admin/#django.contrib.admin.ModelAdmin.save_model

  • Built-in changed_data attribute of a Form:

https://docs.djangoproject.com/en/1.10/ref/forms/api/#django.forms.Form.changed_data

Daniel Holmes
  • 1,952
  • 2
  • 17
  • 28
user3061675
  • 161
  • 1
  • 4
2

While this doesn't actually answer your question, I'd go about this in a different way.

Simply clear the remote_image field after successfully saving the local copy. Then in your save method you can always update the image whenever remote_image isn't empty.

If you'd like to keep a reference to the URL, you could use a non-editable boolean field to handle the caching flag rather than the remote_image field itself.
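The first suggestion can be sketched without Django to show the control flow. Everything here is an assumption made for illustration: AliasSketch is a plain class rather than the real model, image_data stands in for the ImageField, and fetch is a hypothetical callable that takes a URL and returns bytes.

```python
class AliasSketch:
    """Plain-Python stand-in for the Alias model; not a Django model."""

    def __init__(self, remote_image=None, fetch=None):
        self.remote_image = remote_image
        self.image_data = None
        self._fetch = fetch  # hypothetical callable: url -> bytes

    def save(self):
        # A non-empty remote_image means there is something new to download.
        if self.remote_image:
            try:
                data = self._fetch(self.remote_image)
            except IOError:
                return  # keep remote_image so that the next save retries
            self.image_data = data
            # Clear the field only after the local copy has been stored.
            self.remote_image = None


calls = []

def fake_fetch(url):
    calls.append(url)
    return b"png-bytes"

alias = AliasSketch("http://example.com/a.png", fetch=fake_fetch)
alias.save()
alias.save()  # remote_image is empty now, so nothing is downloaded again
print(calls)
```

Assigning a new URL to remote_image is then enough to trigger a fresh download on the next save, with no comparison against the old value.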

SmileyChris
  • 10,578
  • 4
  • 40
  • 33
2

I had this situation before. My solution was to override the pre_save() method of the target field class; it will be called only if the field has been changed.
This is useful with FileField. Example:

class PDFField(FileField):
    def pre_save(self, model_instance, add):
        # do some operations on your file
        # if and only if you have changed the filefield
        return super(PDFField, self).pre_save(model_instance, add)

disadvantage:
not useful if you want to do any (post_save) operation like using the created object in some job (if certain field has changed)

MYaser
  • 369
  • 4
  • 14
2

I have extended the mixin of @livskiy as follows:

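from django.db import models
from django.db.models.fields.files import FieldFile
from django.forms.models import model_to_dict

class ModelDiffMixin(models.Model):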
    """
    A model mixin that tracks model fields' values and provide some useful api
    to know what fields have been changed.
    """
    _dict = DictField(editable=False)
    def __init__(self, *args, **kwargs):
        super(ModelDiffMixin, self).__init__(*args, **kwargs)
        self._initial = self._dict

    @property
    def diff(self):
        d1 = self._initial
        d2 = self._dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if v != d2[k]]
        return dict(diffs)

    @property
    def has_changed(self):
        return bool(self.diff)

    @property
    def changed_fields(self):
        return self.diff.keys()

    def get_field_diff(self, field_name):
        """
        Returns a diff for field if it's changed and None otherwise.
        """
        return self.diff.get(field_name, None)

    def save(self, *args, **kwargs):
        """
        Saves model and set initial state.
        """
        object_dict = model_to_dict(self,
               fields=[field.name for field in self._meta.fields])
        for field in object_dict:
            # for FileFields
            if issubclass(object_dict[field].__class__, FieldFile):
                try:
                    object_dict[field] = object_dict[field].path
                except Exception:
                    object_dict[field] = object_dict[field].name

            # TODO: add other non-serializable field types
        self._dict = object_dict
        super(ModelDiffMixin, self).save(*args, **kwargs)

    class Meta:
        abstract = True

and the DictField is:

import json

class DictField(models.TextField):
    # Python 2 and old Django only: models.SubfieldBase was removed in Django 1.10
    __metaclass__ = models.SubfieldBase
    description = "Stores a python dict"

    def __init__(self, *args, **kwargs):
        super(DictField, self).__init__(*args, **kwargs)

    def to_python(self, value):
        if not value:
            value = {}

        if isinstance(value, dict):
            return value

        return json.loads(value)

    def get_prep_value(self, value):
        if value is None:
            return value
        return json.dumps(value)

    def value_to_string(self, obj):
        value = self._get_val_from_obj(obj)
        return self.get_db_prep_value(value)

It can be used by extending it in your models. A _dict field will be added when you sync/migrate, and that field will store the state of your objects.

MYaser
  • 369
  • 4
  • 14
2

Improving @josh's answer to cover all fields:

class Person(models.Model):
    name = models.CharField()

    def __init__(self, *args, **kwargs):
        super(Person, self).__init__(*args, **kwargs)
        self._original_fields = dict([(field.attname, getattr(self, field.attname))
            for field in self._meta.local_fields if not isinstance(field, models.ForeignKey)])

    def save(self, *args, **kwargs):
        if self.id:
            for field in self._meta.local_fields:
                if not isinstance(field, models.ForeignKey) and \
                        self._original_fields[field.name] != getattr(self, field.name):
                    # Do Something
                    pass
        super(Person, self).save(*args, **kwargs)

Just to clarify, getattr lets you read fields like person.name by passing the name as a string (i.e. getattr(person, "name")).

Hassek
  • 8,715
  • 6
  • 47
  • 59
  • And it is still not making extra db queries? – andilabs Mar 30 '14 at 10:34
  • I was trying to implement your code. It works ok by editing fields. But now i have problem with inserting new. I get DoesNotExist for my FK field in class. Some hint how to solve it will be appreciated. – andilabs Mar 30 '14 at 11:20
  • I have just updated the code, it now skips the foreign keys so you don't need to fetch those files with extra queries (very expensive) and if the object doesn't exist it will skip the extra logic. – Hassek Mar 31 '14 at 16:08
2

My take on @iperelivskiy's solution: on large scale, creating the _initial dict for every __init__ is expensive, and most of the time - unnecessary. I have changed the mixin slightly such that it records changes only when you explicitly tell it to do so (by calling instance.track_changes):

from typing import KeysView, Optional
from django.forms import model_to_dict

class TrackChangesMixin:
    _snapshot: Optional[dict] = None

    def track_changes(self):
        self._snapshot = self.as_dict

    @property
    def diff(self) -> dict:
        if self._snapshot is None:
            raise ValueError("track_changes wasn't called, can't determine diff.")
        d1 = self._snapshot
        d2 = self.as_dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if str(v) != str(d2[k])]
        return dict(diffs)

    @property
    def has_changed(self) -> bool:
        return bool(self.diff)

    @property
    def changed_fields(self) -> KeysView:
        return self.diff.keys()

    @property
    def as_dict(self) -> dict:
        return model_to_dict(self, fields=[field.name for field in self._meta.fields])
A. Kali
  • 739
  • 6
  • 19
  • 1
    I've had a long term issue with django getting recursion errors (specificially RecursionError: Maximum Recursion Depth Exceeded) when trying to delete some objects and I've not been able to figure it out. Turns out it was ModelDiffMixin. Replaced with your version and now it works. So Happy!!!! Thanks. – PhoebeB May 18 '22 at 12:41
2

I have found the package django-lifecycle. It provides a @hook decorator for models as an alternative to Django signals, and it is very robust and reliable. I used it and it is a bliss.

icarus
  • 85
  • 2
  • 13
  • 1
    While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/late-answers/29961155) – Muhammad Dyas Yaskur Oct 01 '21 at 17:36
1

How about using David Cramer's solution:

http://cramer.io/2010/12/06/tracking-changes-to-fields-in-django/

I've had success using it like this:

@track_data('name')
class Mode(models.Model):
    name = models.CharField(max_length=5)
    mode = models.CharField(max_length=5)

    def save(self, *args, **kwargs):
        if self.has_changed('name'):
            print('name changed')
        # without this call the model is never actually saved
        super(Mode, self).save(*args, **kwargs)

    # OR #

    @classmethod
    def post_save(cls, sender, instance, created, **kwargs):
        if instance.has_changed('name'):
            print("Hooray!")
Sion
  • 161
  • 9
  • 3
    If you forget super(Mode, self).save(*args, **kwargs) then you're disabling the save function so remember to put this in the save method. – max Nov 14 '15 at 21:05
  • The link of the article is outdated, this is the new link: https://cra.mr/2010/12/06/tracking-changes-to-fields-in-django – GoTop May 03 '19 at 15:29
  • This answer is unfortunately incomplete. What is @track_data? The link no longer explains anything about this but redirects to the front page instead, which is why it's not good to depend on content in links being permanent. – Teekin May 06 '22 at 10:26
1

A modification to @ivanperelivskiy's answer:

@property
def _dict(self):
    ret = {}
    for field in self._meta.get_fields():
        if isinstance(field, ForeignObjectRel):
            # foreign objects might not have corresponding objects in the database.
            if hasattr(self, field.get_accessor_name()):
                ret[field.get_accessor_name()] = getattr(self, field.get_accessor_name())
            else:
                ret[field.get_accessor_name()] = None
        else:
            ret[field.attname] = getattr(self, field.attname)
    return ret

This uses django 1.10's public method get_fields instead. This makes the code more future proof, but more importantly also includes foreign keys and fields where editable=False.

For reference, here is the implementation of .fields

@cached_property
def fields(self):
    """
    Returns a list of all forward fields on the model and its parents,
    excluding ManyToManyFields.

    Private API intended only to be used by Django itself; get_fields()
    combined with filtering of field properties is the public API for
    obtaining this field list.
    """
    # For legacy reasons, the fields property should only contain forward
    # fields that are not private or with a m2m cardinality. Therefore we
    # pass these three filters as filters to the generator.
    # The third lambda is a longwinded way of checking f.related_model - we don't
    # use that property directly because related_model is a cached property,
    # and all the models may not have been loaded yet; we don't want to cache
    # the string reference to the related_model.
    def is_not_an_m2m_field(f):
        return not (f.is_relation and f.many_to_many)

    def is_not_a_generic_relation(f):
        return not (f.is_relation and f.one_to_many)

    def is_not_a_generic_foreign_key(f):
        return not (
            f.is_relation and f.many_to_one and not (hasattr(f.remote_field, 'model') and f.remote_field.model)
        )

    return make_immutable_fields_list(
        "fields",
        (f for f in self._get_fields(reverse=False)
         if is_not_an_m2m_field(f) and is_not_a_generic_relation(f) and is_not_a_generic_foreign_key(f))
    )
theicfire
  • 2,719
  • 2
  • 26
  • 29
0

As an extension of SmileyChris' answer, you can add a last_updated datetime field to the model and set a limit on the maximum age it may reach before you check for a change.
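The age check itself is a one-line comparison. A minimal sketch, assuming a one-day limit and a helper name (needs_refresh) that are both made up here; in a Django project with time zone support you would compare against timezone.now() rather than a naive datetime.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=1)  # assumed limit, pick whatever suits the data


def needs_refresh(last_updated, now=None, max_age=MAX_AGE):
    """Return True when the cached copy is missing or older than max_age."""
    if last_updated is None:
        return True  # never fetched
    if now is None:
        now = datetime.now()
    return now - last_updated > max_age


now = datetime(2024, 1, 2, 12, 0)
print(needs_refresh(datetime(2024, 1, 1, 11, 0), now=now))  # 25 hours old
print(needs_refresh(datetime(2024, 1, 2, 0, 0), now=now))   # 12 hours old
```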

Jiaaro
  • 74,485
  • 42
  • 169
  • 190
0

The mixin from @ivanlivski is great.

I've extended it to

  • Ensure it works with Decimal fields.
  • Expose properties to simplify usage

The updated code is available here: https://github.com/sknutsonsf/python-contrib/blob/master/src/django/utils/ModelDiffMixin.py

To help people new to Python or Django, I'll give a more complete example. This particular usage is to take a file from a data provider and ensure the records in the database reflect the file.

My model object:

class Station(ModelDiffMixin.ModelDiffMixin, models.Model):
    station_name = models.CharField(max_length=200)
    nearby_city = models.CharField(max_length=200)

    precipitation = models.DecimalField(max_digits=5, decimal_places=2)
    # <list of many other fields>

    def is_float_changed(self, v1, v2):
        ''' Compare two floating values to just two digit precision
        Override Default precision is 5 digits
        '''
        return abs (round (v1 - v2, 2)) > 0.01

The class that loads the file has these methods:

class UpdateWeather(object):
    # other methods omitted

    def update_stations (self, filename):
        # read all existing data 
        all_stations = models.Station.objects.all()
        self._existing_stations = {}

        # insert into a collection for referencing while we check if data exists
        for stn in all_stations.iterator():
            self._existing_stations[stn.id] = stn

        # read the file. result is array of objects in known column order
        data = read_tabbed_file(filename)

        # iterate rows from file and insert or update where needed
        for rownum in range(sh.nrows):
            self._update_row(sh.row(rownum))

        # now anything remaining in the collection is no longer active
        # since it was not found in the newest file
        # for now, delete that record
        # there should never be any of these if the file was created properly
        for stn in self._existing_stations.values():
            stn.delete()
            self._num_deleted = self._num_deleted+1


    def _update_row (self, rowdata):
        stnid = int(rowdata[0].value) 
        name = rowdata[1].value.strip()

        # skip the blank names where data source has ids with no data today
        if len(name) < 1:
            return

        # fetch rest of fields and do sanity test
        nearby_city = rowdata[2].value.strip()
        precip = rowdata[3].value

        if stnid in self._existing_stations:
            stn = self._existing_stations[stnid]
            del self._existing_stations[stnid]
            is_update = True
        else:
            stn = models.Station()
            is_update = False

        # object is new or old, don't care here
        stn.id = stnid
        stn.station_name = name
        stn.nearby_city = nearby_city
        stn.precipitation = precip

        # many other fields updated from the file 

        if is_update:

            # we use a model mixin to simplify detection of changes
            # at the cost of extra memory to store the objects
            if stn.has_changed:
                self._num_updated = self._num_updated + 1
                stn.save()
        else:
            self._num_created = self._num_created + 1
            stn.save()
sknutsonsf
  • 81
  • 1
  • 2
0

Here is another way of doing it.

class Parameter(models.Model):

    def __init__(self, *args, **kwargs):
        super(Parameter, self).__init__(*args, **kwargs)
        self.__original_value = self.value

    def clean(self,*args,**kwargs):
        if self.__original_value == self.value:
            print("igual")
        else:
            print("distinto")

    def save(self, *args, **kwargs):
        self.full_clean()
        result = super(Parameter, self).save(*args, **kwargs)
        # refresh the baseline so later saves compare against the stored value
        self.__original_value = self.value
        return result

    key = models.CharField(max_length=24, db_index=True, unique=True)
    value = models.CharField(max_length=128)

As per documentation: validating objects

"The second step full_clean() performs is to call Model.clean(). This method should be overridden to perform custom validation on your model. This method should be used to provide custom model validation, and to modify attributes on your model if desired. For instance, you could use it to automatically provide a value for a field, or to do validation that requires access to more than a single field:"

Antwane
  • 20,760
  • 7
  • 51
  • 84
Gonzalo
  • 752
  • 8
  • 23
0

If you are not interested in overriding the save method, you can do this:

model_fields = [f.name for f in YourModel._meta.get_fields()]
valid_data = {
    key: new_data[key]
    for key in model_fields
    if key in new_data.keys()
}

for (key, value) in valid_data.items():
    if getattr(instance, key) != value:
        print('Data has changed')

    setattr(instance, key, value)

instance.save()
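The same loop can be wrapped in a helper that reports which fields actually changed. This is a framework-free sketch: apply_changes is a made-up name, SimpleNamespace stands in for a model instance, and the explicit list of field names stands in for YourModel._meta.get_fields().

```python
from types import SimpleNamespace


def apply_changes(instance, new_data, field_names):
    """Copy known fields from new_data onto instance; return the changed names."""
    changed = []
    for name in field_names:
        if name not in new_data:
            continue  # nothing supplied for this field
        if getattr(instance, name) != new_data[name]:
            changed.append(name)
            setattr(instance, name, new_data[name])
    return changed


person = SimpleNamespace(name="Ann", city="Oslo")
changed = apply_changes(
    person,
    {"name": "Ann", "city": "Bergen", "unknown": 1},
    ["name", "city"],
)
print(changed)
```

Keys that are not model fields, such as "unknown" here, are ignored, and the caller can skip the save entirely when the returned list is empty.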
theTypan
  • 5,471
  • 6
  • 24
  • 29
0

Sometimes I want to check for changes on the same specific fields on multiple models that share those fields, so I define a list of those fields and use a signal. In this case, geocoding addresses only if something has changed, or if the entry is new:

from django.db.models.signals import pre_save
from django.dispatch import receiver

@receiver(pre_save, sender=SomeUserProfileModel)
@receiver(pre_save, sender=SomePlaceModel)
@receiver(pre_save, sender=SomeOrganizationModel)
@receiver(pre_save, sender=SomeContactInfoModel)
def geocode_address(sender, instance, *args, **kwargs):

    input_fields = ['address_line', 'address_line_2', 'city', 'state', 'postal_code', 'country']

    try:
        orig = sender.objects.get(id=instance.id)
    except sender.DoesNotExist:
        # there is no original (the entry is new), so geocode it
        my_geocoder_function(instance)
        return

    # geocode only if at least one of the input fields changed
    if any(getattr(instance, field) != getattr(orig, field) for field in input_fields):
        my_geocoder_function(instance)

Writing it once and attaching with "@receiver" sure beats overriding multiple model save methods, but perhaps some others have better ideas.

Milo Persic
  • 985
  • 1
  • 7
  • 17