4

I have to sort a list of objects containing hostnames.

The hostnames are in these formats: h1, h5, h10, h12, h12-abc, h1000, x10

If i use order_by('hostname') it will order like this:

h1, h10, h1000, h12, h12-abc, h5, x10

How would i achieve an ordering like this:

h1, h5, h10, h12, h12-abc, h1000, x10

The hostnames always begin with a char, then 1-4 digits and partly an extension, like for example '-abc'.

I guess i have to use Substr() to extract the number and order the numbers somehow, that '10' will not be listed before '5'.

With a search i found some old examples with extra() but the Django documentation says it will be deprecated in future and 'Use this method as a last resort' https://docs.djangoproject.com/en/2.1/ref/models/querysets/#extra

What is a future-proof way to do it?

Brown Bear
  • 19,655
  • 10
  • 58
  • 76
fuser60596
  • 1,087
  • 1
  • 12
  • 26
  • Looks like these two answers should help you: https://stackoverflow.com/questions/5967500/how-to-correctly-sort-a-string-with-a-number-inside and https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort – Dan Swain Feb 20 '19 at 15:30
  • 1
    @DanSwain i think the OP want to use the django orm – Brown Bear Feb 20 '19 at 15:34
  • what database do you use? – Brown Bear Feb 20 '19 at 15:37
  • @BearBrown PostgreSQL – fuser60596 Feb 20 '19 at 15:43
  • 1
    Have you looked at [db functions](https://docs.djangoproject.com/en/2.1/ref/models/database-functions/#substr)? I wrote down this idea, but I have no idea if it would work, but may be a starting point `Hosts.objects.annotate( letter=Substr("hostname", 0, 1), extension=StrIndex(F("hostname"), Value("-")), ).annotate( numerical=Cast(Substr("hostname", 1, Coalesce("extension", Value(None))), IntegerField()), ).order_by("extensions", "numerical")` – mfrackowiak Feb 20 '19 at 15:45

2 Answers2

5

you can use the f-expressions

from django.db.models import F, Value, TextField, IntegerField
from django.contrib.postgres.fields import ArrayField
from django.db.models.expressions import Func

sql = ordModel.objects.annotate(
        num=Cast(
            Func(
                F('hostname'),
                Value("\d+"),
                function='regexp_matches',
            ),
            output_field=ArrayField(IntegerField())
        ),
        char=Func(
            F('hostname'),
            Value("\D+"),
            function='regexp_matches',
            output_field=ArrayField(TextField())
        )
    ).order_by('char', 'num', ).values('hostname')

my result for the same list of values is:

<QuerySet [
{'hostname': 'h1'},
{'hostname': 'h5'},
{'hostname': 'h10'},
{'hostname': 'h12'},
{'hostname': 'h12-abc'},
{'hostname': 'h1000'},
{'hostname': 'x10'}]>

about the database function you can readL regexp_match

Brown Bear
  • 19,655
  • 10
  • 58
  • 76
  • 1
    output looks great, but where is `function='regexp_matches'` coming from? – fuser60596 Feb 20 '19 at 21:12
  • 2
    this is the postgres function https://www.postgresql.org/docs/10/functions-matching.html#FUNCTIONS-POSIX-REGEXP – Brown Bear Feb 20 '19 at 21:13
  • on a second view , the order is not really correct. it orders like order_by(hostname) , which i try to avoid. it should be h1, h5, h10, h12, h12-abc, h1000, x10 - not h1, h10, h1000, h12 and so on - any idea how to fix this? – fuser60596 Feb 20 '19 at 21:36
  • @FelixK i fixed the regexp now the result should be fine. – Brown Bear Feb 21 '19 at 06:39
  • For PostgreSQL this works, so i will accept this as the answer. Below you will find my alternative database independent solution, which works too. Thanks! – fuser60596 Feb 21 '19 at 16:59
2

I got it to work with an extra field normalized_hostname, which is also database independent. Implemented in the model with the help of Django Signals pre_save()

https://docs.djangoproject.com/en/2.1/ref/signals/#pre-save

The code below transformes the hostname to a format, which then can be used with order_by('normalized_hostname')

Examples:
hostname -> normalized_hostname

h1 -> h0001 
h5 -> h0005, 
h10 -> h0010 
h12 -> h0012
h12-abc -> h0012-abc 
h1000 -> h1000 
x10 -> x0010

models.py

from django.db.models.signals import pre_save
import re

class MyModel(models.Model):
  the solution is also database independent  hostname = models.CharField(max_length=64)
    normalized_hostname = models.CharField(max_length=64)



def create_normalize_hostname(instance):
    normalize = re.sub("\D", "", instance.hostname).zfill(4)
    normalized_hostname = re.sub("(\d{1,4})", normalize, instance.hostname)
    return normalized_hostname

def receiver(sender, instance, *args, **kwargs)
    instance.normalized_hostname = create_normalize_hostname(instance)

pre_save.connect(receiver, sender=ModelName)

Now it will order like this:

h1, h5, h10, h12, h12-abc, h1000, x10
fuser60596
  • 1,087
  • 1
  • 12
  • 26