0

Let's say I have a table of movies. The name of the movie is a varchar. But sometimes the movie names contain sequel (or series or episode) numbers that go over 9. How can I sort the movie name to handle numbers like a human would?

This is how it sorts by default:

select * from movies order by name

Alpha
Beta
Gamma 1
Gamma 10
Gamma 2
Gamma 3
Gamma 4
Gamma 5
Gamma 6
Gamma 7
Gamma 8
Gamma 9
Delta

See the 10 after 1? I know why it does that, what I want to know is how to make it sort like this:

Alpha
Beta
Gamma 1
Gamma 2
Gamma 3
Gamma 4
Gamma 5
Gamma 6
Gamma 7
Gamma 8
Gamma 9
Gamma 10
Delta

I have looked for solutions all over and haven't found anything that solves this.

Please note that this is just a simple example, the actual names can be much more complex, like Gamma The Explorer 1. episode or Gamma The Explorer 2. series 1. episode or Gamma The Explorer 3. episode Director's Cut and that most other movie names don't have numbers in them, so please no solutions that count on the number being in a specific place or position in the string or sorting using name+0 - that's not the solution I want.

I know for a fact that Windows Explorer, among other software, handles sorting file names in the correct (human) fashion, and I don't believe MySQL wouldn't be able to handle something so obvious, so there ought to be a feasible solution for this.

Edit: Apparently it's called natural sorting. Does MySQL really not have this at all? Optionally: What is UDF, and where do I get one for natural sorting, and how do I get it on my server?

Zdeněk Gromnica
  • 854
  • 2
  • 14
  • 31
  • 2
    Related mysql feature request: http://bugs.mysql.com/bug.php?id=1588 – Pekka Mar 07 '11 at 16:11
  • Thanks, I didn't see that one. – Zdeněk Gromnica Mar 07 '11 at 16:16
  • A UDF is like a stored function, but it's written in C rather than in SQL, and has to be linked into the MySQL server when you set it up. For the purpose of nat-sorting strings, I don't believe there's any advantage to using a UDF over a regular stored function, except *perhaps* execution speed. To do a true nat-sort you actually need a custom *comparison* function, which isn't possible short of significantly changing the MySQL source code. But you can get away with a hack, whereby you create a sort key corresponding to your string, that attempts to encode numbers so they sort right. – Doin Sep 29 '19 at 11:34
  • Unfortunately (and surprisingly, given how common the need for nat-sort is) the solutions available until now that made use of this kind of hack have overall been rather poor in code quality and limited in flexibility. I've just posted what I think is a very big improvement over all the existing code (that I've found on stackoverflow and elsewhere, anyway). It can certainly be used to sort movie names the way you'd want. Check it out here: https://stackoverflow.com/a/58154535/999120 – Doin Sep 29 '19 at 11:42

1 Answers1

0
  SELECT movies.*, 
         SUBSTRING_INDEX(name, ' ', 1) AS name_str,
         SUBSTRING_INDEX(name, ' ', -1) + 0 AS name_num
    FROM movies
ORDER BY name_str, name_num
hsz
  • 148,279
  • 62
  • 259
  • 315
  • Thanks, but that's not what I'm looking for. What I listed was just a simple example. The actual names can be much more complex, like `Gamma The Explorer 1. episode` or `Gamma The Explorer 2. series 1. episode` or `Gamma The Explorer 3. episode Director's Cut` and most other movie names don't have numbers in them. – Zdeněk Gromnica Mar 07 '11 at 16:29