0

I am working on converting some relatively complex SQL into something that Django can play with. I am trying not to just use the raw SQL, since I think playing with the standard Django toolkit will help me learn more about Django.

I have already managed to break up parts of the sql into chunks, and am tackling them piecemeal to make things a little easier.

Here is the SQL in question:

SELECT i.year, i.brand, i.desc, i.colour, i.size, i.mpn, i.url, 
COALESCE(DATE_FORMAT(i_eta.eta, '%M %Y'),'Unknown') 
as eta 
FROM i 
JOIN i_eta ON i_eta.mpn = i.mpn 
WHERE category LIKE 'kids' 
ORDER BY i.brand, i.desc, i.colour, FIELD(size, 'xxl','xl','l','ml','m','s','xs','xxs') DESC, size+0, size

Here is what I have (trying to convert line by line):

(grabbed automatically when performing filters)
(have to figure out django doc on coalesce for syntax)
db alias haven't been able to find yet - it is crucial since there is a db view that requires it
already included in the original q
.select_related?
.filter(category="kids")
.objects.order_by('brand','desc','colour') - don't know how to deal with SQL FIELDS

Any advice would be appreciated!

dukehenry
  • 21
  • 1
  • 4
  • 1) Are you allowed to change the structure of the table? 2) Are you looking for something which is idiomatic Django, or are you looking to change the logic of the above statement as little as possible? – Nick ODell Dec 19 '21 at 03:20
  • I don't know enough about Django to know what the best course of action is. The SQL query with the associated view works well and is very fast. I would like to know the "best" way to do it in Django (whatever what means). If there are "best practices" that can convert the SQL statement to something that yields a similar result then I would go for that. – dukehenry Dec 19 '21 at 03:40

1 Answers1

0

Here's how I would structure this.

First, I'm assuming your models for i and i_eta look something like this:

class I(models.Model):
    mpn = models.CharField(max_length=30, primary_key=True)
    year = models.CharField(max_length=30)
    brand = models.CharField(max_length=30)
    desc = models.CharField(max_length=100)
    colour = models.CharField(max_length=30)
    size = models.CharField(max_length=3)

class IEta(models.Model):
    i = models.ForeignKey(I, on_delete=models.CASCADE)
    eta = models.DateField()

General thoughts:

  1. To write the coalesce in Django: I would not replace nulls with "Unknown" in the ORM. This is a presentation-layer concern: it should be dealt with in a template.

  2. For date formatting, you can do date formatting in Python.

  3. Not sure what a DB alias is.

  4. For using multiple tables together, you can use either select_related(), prefetch_related(), or do nothing.

    1. select_related() will perform a join.
    2. prefect_related() will get the foreign key ID's from the first queryset, then generate a query like SELECT * FROM table WHERE id in (12, 13, 14).
    3. Doing nothing will work automatically, but has the disadvantage of the SELECT N+1 problem.

    I generally prefer prefetch_related().

  5. For customizing the sort order of the size field, you have three options. My preference would be option 1, but any of the three will work.

    1. Denormalize the sort criteria. Add a new field called size_numeric. Override the save() method to populate this field when saving new instances, giving xxl the value 1, xl the value 2, etc.
    2. Sort in Python. Essentially, you use Python's built-in sorting methods to do the sort, rather than sorting it in the database. This doesn't work well if you have thousands of results.
    3. Invoke the MySQL function. Essentially, using annotate(), you add the output of a function to the queryset. order_by() can sort by that function.
Nick ODell
  • 15,465
  • 3
  • 32
  • 66
  • @dukehenry Django automatically adds an id field if your model lacks a primary key. See https://docs.djangoproject.com/en/4.0/topics/db/models/#automatic-primary-key-fields – Nick ODell Dec 19 '21 at 16:35
  • Thanks, I just found that in the docs. Is there a way I should have defined my model other than using mpn = models.CharField(primary_key=True, max_length=50) ? – dukehenry Dec 19 '21 at 16:38
  • @dukehenry Yes, that's right. Django does require that models have a primary key. If mpn fulfills the qualities of being a primary key, then there is no need to add an additional primary key. – Nick ODell Dec 19 '21 at 16:40
  • I have managed to have some success and can taste progress. I have created 2 separate queries that give me useful results, but need to merge them together and have run into trouble. 1. (base) query = i.objects.filter(category='kids').values('year', 'brand', 'desc', 'colour', 'size', 'mpn', 'url').order_by('brand','desc','colour') 2. (view) query = i_Eta.objects.values('eta') 3. (combined) query = i.objects.filter(category='kids').select_related('i_Eta', 'eta').values('year', 'brand', 'desc', 'colour', 'size', 'mpn', 'url').order_by('brand','desc','colour') – dukehenry Dec 19 '21 at 17:28
  • @dukehenry I would suggest you not use .values(), as it's easier to manipulate ORM objects than dictionaries. For example, you can get a reference to the object referenced by foreign key. If you need to restrict the select for performance reasons, I'd use .only() instead. – Nick ODell Dec 19 '21 at 17:36
  • If you do want to use .values(), you can also get the value of the foriegn key using a double underscore. Example: https://stackoverflow.com/questions/27180190/django-using-objects-values-and-get-foreignkey-data-in-template – Nick ODell Dec 19 '21 at 17:59
  • Am I correct that the start reference will generally pull all fields? If so, that is fine since I can use them all anyway. In that case, I would just remove .values() entirely. I am still confused as to why query = i.objects.filter(category='kids').select_related('i_Eta', 'eta').order_by('brand','desc','colour') doesn't work - I seem to get an error saying that the field names given in my select_related aren't valid. i_Eta is indeed a separate table from i so I thought that it would work. Back to the docs, thanks – dukehenry Dec 19 '21 at 18:12
  • @dukehenry IIRC, select_related() only works in one direction. If A has a foreign key to B, then you can't start with B and do a select_related() to A. I'm not exactly sure what causes that limitation, but it's the reason I usually do prefetch_related(). Also, for select_related(), you give the name of the field which is the foreign key, and *not* the name of the table. – Nick ODell Dec 19 '21 at 19:04
  • Found that in another stackoverflow question about the proper ordering of the tables. Thanks! If I revise query to something as simple as kidsquery = i_Eta.objects.select_related('mpn') then I get the error about unknown column i_eta.id in field list. How can I fix this issue? I feel it has to be addressed before figuring out how to output the other table's fields. – dukehenry Dec 19 '21 at 19:29
  • @dukehenry Do you have a primary key on that table? If not, Django will add one for you. – Nick ODell Dec 19 '21 at 22:54
  • I wasn't aware that you could set a primary_key value on a models.ForeignKey variable. So, once I did that and migrated it eliminated that issue. Still working on being able to retrieve the results from the other class, but have some documentation that may provide insight. Thanks! – dukehenry Dec 20 '21 at 01:28