
My models don't really matter in this case; this is a fundamental Python question, I suppose.

Say I have a queryset of items and I want to calculate some things for each one to be displayed in a template.

In my view, I can create a list of objects, and for each object I can set a property on that object for the calculation, then I can display that in the template. OR I can create a list of dictionaries and only get the fields I need to display in each dictionary along with the calculated field. Which is better for performance, and in general practice?

An overly simplified example for clarity (I know I can call getAge() from the template; what I am really calculating is more complex, and for performance I want to do the calculations in the view code):

models.py:

class Person(models.Model):
    first_name = ...
    last_name = ...
    date_of_birth = ...
    .
    .
    .
    def getAge(self):
        return ... # return the calculated years since date_of_birth

views.py:

def method1_object_property(request):
    people = Person.objects.all()
    for p in people:
        p.age = p.getAge()
    return render_to_response('template.htm', {'people': people})

def method2_dictionary(request):
    people = Person.objects.all()
    data = list()
    for p in people:
        row = dict()
        row['first_name'] = p.first_name
        row['last_name'] = p.last_name
        row['age'] = p.getAge()
        data.append(row)
    return render_to_response('template.htm', {'people': data})

template.htm:

<ul>
    {% for p in people %}
        <li>{{ p.first_name }} {{ p.last_name }} (Age: {{ p.age }})</li>
    {% endfor %}
</ul>

Both methods work just fine so far as I can tell. I was just curious what the preferred method would be and why. Are there performance issues with assigning new fields dynamically to an existing object in memory using the object dot property method (object.new_field = 'some_detail')?

UPDATE:

Yes, I know in my example I can call getAge() from template, and yes, this is the incorrect naming standard for methods which should be lowercase with underscores. I think my example is too simple and is clouding what I really want to know.

What is the best way to add information to an object that I want displayed in the view but that is not part of the model layer? Say I get a QuerySet of Person objects and want to calculate how many times they have logged into my website in the last 30, 60 and 90 days. I want to create three "properties" for each Person object on the fly. I can set this in the view with

for p in people:
    p.last_30 = Login.objects.filter(person=p, login_date__gt=date.today()-timedelta(days=30))
    p.last_60 = Login.objects.filter(person=p, login_date__gt=date.today()-timedelta(days=60))
    p.last_90 = Login.objects.filter(person=p, login_date__gt=date.today()-timedelta(days=90))

Then in my template I can display those "properties." I just wanted to make sure I'm not violating some Python standard or cheating the system. Alternatively, I could store these other lookups in a dictionary, with the object under one key and the various details under separate keys. This is a bit more work in the view, but I was curious whether it is better for performance or for compliance with standards to do so.

Sorry if my original question was not clear enough, or my example added confusion.

Furbeenator
  • dictionaries vs properties is irrelevant performance-wise, you need to focus on reducing the total number of database queries. see updated answer – Anentropic Dec 27 '13 at 02:48

2 Answers


Definitely method 1.

Method 2 is pointless: you can iterate over the queryset directly in the template, so there is no need to build up an intermediate 'list of dicts' in your view. E.g. you can just do:

def method2_dictionary(request):
    people = Person.objects.all()
    return render_to_response('template.htm', {'people': people})

in your template:

{% for p in people %}
    {{ p.first_name }}
    etc
{% endfor %}

Coming back to method 1...

This: p.age = p.getAge() is also pointless. You can call the method directly in your template as {{ p.getAge }} (as long as your method does not take arguments); see the docs here:
https://docs.djangoproject.com/en/dev/topics/templates/#accessing-method-calls

Note that in Python we generally prefer to use 'lower-case with underscores' for method names, eg def get_age(self) and {{ p.get_age }}
(see the official 'PEP8' style guide for Python here http://www.python.org/dev/peps/pep-0008/#function-names)

If your get_age method has no side-effects and takes no arguments, you may like to make it a property, which is Python's way of having a getter method you can access without the parentheses.

In this case it would make sense to name it just age:

@property
def age(self):
    return ... # return the calculated years since date_of_birth

and in your template:

{% for p in people %}
    {{ p.first_name }}
    {{ p.age }}
    etc
{% endfor %}
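To make the property idea concrete, here is a minimal plain-Python sketch. The `Person` class below is a hypothetical stand-in rather than the asker's Django model, and `age_on` is a helper name I made up so the calculation can be checked against a fixed date; the birthday arithmetic is one common way to count whole years, not necessarily what the original `getAge()` does.

```python
from datetime import date


class Person:
    """Plain-Python stand-in for the Django model (hypothetical, for illustration)."""

    def __init__(self, first_name, date_of_birth):
        self.first_name = first_name
        self.date_of_birth = date_of_birth

    def age_on(self, today):
        """Whole years between date_of_birth and the given day."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years

    @property
    def age(self):
        # Read like an attribute: person.age, no parentheses.
        return self.age_on(date.today())


person = Person("Ada", date(1990, 6, 15))
print(person.age_on(date(2013, 12, 27)))  # 23
```

Because `age` is a property, both Python code and the template read it as `p.age`, and nothing has to be assigned in the view.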

For more info about Python properties, see here:
http://docs.python.org/2/library/functions.html#property

Some more info in this SO question:
Real world example about how to use property feature in python?

UPDATE

Referring to your updated question... as a question of style I would still make these (last_30 etc) methods on the model rather than adding ad hoc properties onto each model instance in the view code.

From a performance perspective, the difference in memory, processing time etc of method lookup vs dictionaries etc is trivial in most real world situations. By far the biggest performance consideration in this kind of code is usually the number of database queries.
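One way to see why attributes and dictionaries end up so close: on an ordinary Python instance (one whose class does not define `__slots__`), an attribute assigned on the fly is stored in the instance's own `__dict__`. The `Person` class here is a hypothetical stand-in for a model instance, just to show the mechanism.

```python
class Person:
    """Stand-in for a model instance (hypothetical; no __slots__ defined)."""

    def __init__(self, first_name):
        self.first_name = first_name


p = Person("Ada")
p.age = 23  # ad hoc attribute, as in method 1

# The new attribute lives in the instance's own dictionary.
print(vars(p))  # {'first_name': 'Ada', 'age': 23}

row = {"first_name": "Ada", "age": 23}  # the method 2 equivalent
print(vars(p) == row)  # True
```

So method 1 is already "a dictionary per object" under the hood, and building a second one by hand mostly adds code.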

If you know you're going to do an extra query (or three) for each item in your queryset, it's worth looking for ways to get everything in one or a few big queries instead.

In some cases you may be able to use the annotate() method:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#annotate

(I don't think that's possible in your example)

In your specific code above you only need to query for the 90 days (the oldest interval), and you could then filter the 60 and 30 day sets from that result in Python without querying the db again.
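A rough sketch of that in-Python filtering, using plain `date` objects in place of Login rows. The helper name `logins_since` and the sample dates are made up for illustration; the strict `>` comparison mirrors the `login_date__gt` lookup in the question.

```python
from datetime import date, timedelta


def logins_since(login_dates, days, today):
    """Return the dates strictly later than `today - days` (mirrors login_date__gt)."""
    cutoff = today - timedelta(days=days)
    return [d for d in login_dates if d > cutoff]


today = date(2013, 12, 27)
# Pretend this list came from the single 90-day query for one person.
last_90 = [today - timedelta(days=n) for n in (1, 10, 45, 70, 89)]

last_60 = logins_since(last_90, 60, today)
last_30 = logins_since(last_90, 30, today)
print(len(last_30), len(last_60), len(last_90))  # 2 3 5
```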

But that would still be doing one extra query per item in your people queryset. It'd be better to do one big query for the Login objects of all the people (or of whatever subset you need). Since there is a foreign key relation for Person on Login, we can use select_related() to get the Person instances in the same big query when we query the Login model:

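from datetime import date, timedelta

def method3(request):
    # One query for every login in the last 90 days; select_related()
    # follows the foreign key so each login's Person comes along with it.
    logins = Login.objects.filter(
        person__in=Person.objects.all(),
        login_date__gt=date.today() - timedelta(days=90)
    ).order_by('person', 'login_date').select_related()
    return render_to_response('template.htm', {'logins': logins})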

Note that if you're really doing Person.objects.all(), you don't need the person__in filter above. It is only needed if you want to restrict the Person set in some way.

Now that we have all the data from one big query, we can do what we need to on the Python side to display it. E.g. in the template we could use the regroup tag:

{% regroup logins by person as people %}
{% for person in people %}
    {% with person.grouper as p %}
        {{ p.first_name }}
        {% for login in person.list %}
            {{ login.login_date }}
        {% endfor %}
    {% endwith %}
{% endfor %}
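If you would rather do the grouping in the view than in the template, `itertools.groupby` does roughly what `regroup` does. This is only a sketch: the `Login` namedtuple and the sample rows are hypothetical stand-ins, with `person` reduced to a plain name. Like `regroup`, `groupby` only merges adjacent items, which is why the `order_by('person', 'login_date')` in the query matters.

```python
from collections import namedtuple
from datetime import date
from itertools import groupby
from operator import attrgetter

# Hypothetical stand-in for a Login row; `person` is just a name here.
Login = namedtuple("Login", ["person", "login_date"])

# Already sorted by person, then login_date, like the order_by() in the query.
logins = [
    Login("alice", date(2013, 11, 2)),
    Login("alice", date(2013, 12, 20)),
    Login("bob", date(2013, 12, 1)),
]

# groupby only merges adjacent items, so the input must be sorted by the key.
grouped = {
    person: [login.login_date for login in rows]
    for person, rows in groupby(logins, key=attrgetter("person"))
}
print(grouped["alice"])  # [datetime.date(2013, 11, 2), datetime.date(2013, 12, 20)]
```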

You could take this further and write a custom tag for the login date ranges... I won't detail that here but in your template it could look something like:

{% regroup logins by person as people %}
{% for person in people %}
    {% with person.grouper as p %}
        {{ p.first_name }}
        {% logins_since person.list 60 as last_60_days %}
        {% logins_since person.list 30 as last_30_days %}
        {% for login in last_30_days %}
            {{ login.login_date }}
        {% endfor %}
        {% for login in last_60_days %}
            {{ login.login_date }}
        {% endfor %}
    {% endwith %}
{% endfor %}
Anentropic
  • An added advantage of directly using model methods is testability: you can easily test single model methods and properties; testing a whole view 'blob' can be more difficult – sk1p Dec 26 '13 at 23:59
  • yes, also the point of the model is to keep all that kind of logic in one place, so you can use it across multiple views etc – Anentropic Dec 27 '13 at 00:05
  • Yes, thank you for the response. I understand about calling the method directly in the template, but let's say it is a much more DB intensive set of calculations. It is preferred not to call DB functions from the template, although it is possible to do so. Perhaps my example was bad. I shouldn't have used a class method but instead want to calculate some things in the view. Let's say I'm looking up all the login sessions for a particular Person for the last 30 days, but there isn't a class method for that. Is it still better to assign, say another Queryset, to a property on the fly in Python? – Furbeenator Dec 27 '13 at 00:12
  • Why not create a method for that, if there isn't one yet? If you don't 'own' the model, i.e. if it's from a third party app, you can use a Proxy model; see the [source code of django-zinnia-blog](https://github.com/Fantomas42/django-blog-zinnia/blob/develop/zinnia/models/author.py) for a nice example. – sk1p Dec 27 '13 at 01:08
  • @Anentropic it would all be so simple if one could call `person.logins_since(30)` etc from the template. Yet instead we have to write custom tags to do things like that... – Andy Feb 26 '17 at 03:55

Don't bother with dictionaries. Looking at those two methods, I fail to see what real problem the second one solves. From the template's perspective both methods produce the same outcome, but the first one is far shorter than the second.

However, there are some issues I see in your code:

First, if you really care about performance, you should avoid performing needless work. The age-setting step in the first method is not really the best way to solve the problem, and its memory usage will grow as you add new persons to the database.

Did you know that in templates you can use functions/methods that don't accept any arguments (or just the "self" argument in the case of methods) as if they were attributes? If you rename "getAge" to "age", you can simplify the first method's code down to this:

def method1_object_property(request):
    people = Person.objects.all()
    return render_to_response('template.htm', {'people': people})
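Why this works, and why dictionaries and objects look the same from the template's side: a dot lookup like `p.age` tries a dictionary lookup, then an attribute or method lookup, and calls the result with no arguments if it is callable. Below is a simplified sketch of that rule in plain Python. It is not Django's actual implementation (it leaves out the numeric index lookup and error handling), and `resolve` is a made-up name.

```python
def resolve(obj, name):
    """Simplified sketch of a template dot-lookup; not Django's actual code."""
    try:
        value = obj[name]  # 1. dictionary lookup
    except (TypeError, KeyError, IndexError):
        value = getattr(obj, name)  # 2. attribute or method lookup
    if callable(value):
        value = value()  # methods are called with no arguments
    return value


class Person:
    first_name = "Ada"

    def age(self):
        return 23


print(resolve(Person(), "age"))  # 23
print(resolve({"first_name": "Ada", "age": 23}, "age"))  # 23
```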

Also, please take a while to familiarize yourself with the PEP8 document, which sets out conventions for writing Python code: http://www.python.org/dev/peps/pep-0008/

As per PEP8, "getAge" is not a correct function name; lowercase with underscores should be used, so "get_age" is good while "getAge" is "unpythonic". However, because this function is basically a dynamically computed attribute, you can leave it as "age" and optionally add the @property decorator to it, giving it the same behaviour in Python code that it has in a Django template.

Now about optimisation. The default behaviour of a Django queryset upon evaluation is to convert all results returned by the database to Python objects. So if you have 2 rows in the table, Person.objects.all() will produce two Person objects. But if you have 9000 rows, it will produce 9000 Python objects that will immediately consume a large amount of memory.

You have two ways to defend yourself against this:

First, you can limit the queryset to a specified number of items, either by hardcoding it to fetch, say, the 5 latest members, or by implementing pagination, or finally by making the display of profiles come only after the user enters search criteria for persons.

Limiting ("slicing") querysets is covered by Django docs here: https://docs.djangoproject.com/en/1.6/topics/db/queries/#limiting-querysets

Secondly, you can make Django use a lazy approach to turning database rows into Python objects by adding .iterator() at the end of your query. This makes Django turn rows into objects as they are returned by the queryset, which is more memory friendly but imposes some limitations on your code, because instead of a list-like object you will get a generator object.

This change will make Django create a Person object for each result row once, use it to display the row in the list, and then throw it away.
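The trade-off is the usual one between a list and a generator in plain Python. The sketch below does not touch Django at all; `fetch_rows` is a hypothetical stand-in for rows streaming back from the database, just to show which operations you give up.

```python
def fetch_rows(n):
    """Hypothetical stand-in for a database cursor: yields one row at a time."""
    for i in range(n):
        yield {"id": i}


eager = list(fetch_rows(3))  # all rows held in memory at once
lazy = fetch_rows(3)         # nothing is produced until iterated

print(len(eager))                   # 3
print([row["id"] for row in lazy])  # [0, 1, 2]
print([row["id"] for row in lazy])  # [] (a generator is exhausted after one pass)
```

So with the lazy version you cannot take the length, index into it, or loop over the same object twice, which matters if your template iterates over `people` more than once.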

Ralfp
  • Thank you, I will look at the `.iterator()` more in the future. What I really want to know is if it is better to store some calculations, or related querysets as properties specified with the object dot property method or if I should store those variables in a dictionary instead of as object properties. Say, I don't have them as class methods, but as a separate queryset/calculation in the view. – Furbeenator Dec 27 '13 at 00:42