3

Long time reader, first-time question...er.

I'm refactoring and upgrading some incredibly old PHP and MySQL queries (through PDO) at the moment, and have come across a very complex ordering requirement that I can't get right.

I have a table listing titles. Let's say they're books. I need to get a list of all those titles out in order to list them under different tabs according to their first letter. The ordering requirements are as follows:

  1. Ignore leading 'the' (not necessary for 'a' or 'an')
  2. Ignore leading punctuation
  3. Place numbers last

E.g.

  • A Tale of Two Cities
  • The Adventures of Tom Sawyer
  • Charlie and the Chocolate Factory
  • @hotmail.com: A Terror Story from 2002
  • 'No' - Standing Your Ground
  • The Picture of Dorian Gray
  • War and Peace
  • The 8th Dwarf
  • 1000 Years of Annoying the French

All three are easy. 1 & 2 I can do together without any trouble - albeit with a mucky list of specific punctuation, rather than a catch-all (regex maybe?). 3 I can do alone thanks to previous question 'ORDER BY alphabet first then follow by number'; but I can't get all three working at the same time.

Implementing requirements 1 and 2, the query as it stands is as follows:

SELECT title FROM books WHERE discontinued=0 ORDER BY TRIM(LEADING " " FROM TRIM(LEADING "!" FROM TRIM(LEADING "@" FROM TRIM(LEADING "'" FROM TRIM(LEADING "(" FROM TRIM(LEADING "..." FROM TRIM(LEADING "%" FROM TRIM(LEADING "The " FROM title)))))))) ASC, year_of_publication ASC
  • Firstly and primarily, can anyone help me get all 3 working at the same time?
  • Secondly, bearing in mind that 'The' and all the punctuation is left in tact in the title values that are returned to my PHP, can anyone suggest a nice way to then split up the records as I iterate through them in PHP, in order to populate the 27 A-Z and # tabs? A virtual column listing the first character that SQL is sorting on, perhaps? (The code I'm refactoring achieved this by running 27 queries (!), for titles starting with each letter in turn - not something I want to do if I can avoid it!)
  • Thirdly, is there a neater way of trimming leading punctuation from the title than listing them individually? (Not so important, but if it makes nicer/more efficient code...)
  • Fourth and finally, if anyone could throw in a way to order those numbers 1, 2, 3 ... 10 rather than 1, 10, 2, 3, that would also be appreciated. (Again, I've seen this 'natural sort' implemented in simpler queries, but not with something like this. It's not important, but nice if it's possible alongside all the rest of my requirements!)

Any thoughts, help or advice in solving this problem would be greatly appreciated and gratefully received! Happy, of course, to try going about it a different way if that's what it takes.

double-beep
  • 5,031
  • 17
  • 33
  • 41
Aaron
  • 43
  • 6
  • 1
    Possible duplicate of [Natural Sort in MySQL](http://stackoverflow.com/questions/153633/natural-sort-in-mysql) – Matt S May 05 '16 at 20:35
  • 1
    I would suggest you make a separate column ie `sort_title`, do all your transformations on that column on insert or update and make sure it has a full text index on it. What you are trying to do with each query is very expensive CPU and Memory wise. – cmorrissey May 05 '16 at 20:47
  • 1
    To simplify things a little, since you want things like "The 8th" filed with the numbers, I would make your massive trim a SELECTed field so you can use it's alias in the results; for example: `ORDER BY normalizedTitle REGEXP "^[0-9]", normalizedTitle` – Uueerdo May 05 '16 at 21:07

1 Answers1

0

As soon as you order by a function rather than a field all opportunities to use an index go out the window.
I would suggest you not follow this route. Instead add an extra field to the table where you trim out the parts that you don't want to order/filter on and use that to select.

SELECT title FROM books WHERE discontinued=0   
ORDER BY trimmed_title ASC, year_of_publication ASC  

When inserting data in the table you can add a trigger to the table that does the trimming for you

delimiter ||

CREATE TRIGGER bi_books BEFORE INSERT ON books
  FOR EACH ROW
  BEGIN
    #include whatever trimming you want here.
    set NEW.trimmed_title = TRIM(LEADING "The " FROM NEW.title);
  END;
||

delimiter ;

See here for info on natural sort for numbers: Natural Sort in MySQL

Community
  • 1
  • 1
Johan
  • 74,508
  • 24
  • 191
  • 319