0

I want to sort a field alphanumerically in the database. It turns out to be trickier than I thought. This is just example values, the content can vary, but I hope it's enough to get the idea.

I want to sort this list:

11
01
1
1A
01B
20a
01a
20
1b
2b
02a

Like this:

1
01
1A
01a
1b
01B
02a
2b
11
20
20a

Note that the relative ordering of equivalent numbers with and without leading zeroes is not important, it can be 1 01 or 01 1.

I've tried CAST(field AS UNSIGNED) but it doesn't work. Ideas?

Barmar
  • 741,623
  • 53
  • 500
  • 612
SeaBass
  • 1,584
  • 4
  • 20
  • 46
  • What are your rules for sorting ? eg, why does 2b come before both 11 and 20 ? – racraman Jun 10 '19 at 23:35
  • @racraman Looks like it's sorting by the numeric prefix first, then the length of the numeric part (to handle leading zeroes), then the alphabetic suffix. – Barmar Jun 10 '19 at 23:36
  • Because the letters are different versions of the digits. So 2 comes first, 2a and 2b are versions of that. In other words they should come before 11 and 20. – SeaBass Jun 10 '19 at 23:37
  • 1
    OK, so '11' and '20' are eleven and twenty, so come after 2. So in summary, you're looking to split the field into 3 substrings (a: optional leading zero(s), b: number as sequence of digits, c:optional letter suffix), then sort by a,(b converted to integer),c - yes? – racraman Jun 10 '19 at 23:49
  • So...sometimes 2 characters count as 1 character for sorting? Can these character modifying characters occur in the middle of the string, or only at the end? – Ben Jun 10 '19 at 23:49
  • Yes, 01, 00001 and 1 are all before 2, as 02 is before 3 etc. and if you have the same string but a letter at the end, the letter should come last: 120 is before 120A – SeaBass Jun 10 '19 at 23:53
  • Are you using MySQL 8.0? You can use the `REGEXP_SUBSTR()` function to extract the numeric prefix and alphabetic suffix. – Barmar Jun 10 '19 at 23:58
  • If not, you'll need to write a function to do it, as there are no built-in functions to find the position of a character set, you can only look for specific strings. – Barmar Jun 10 '19 at 23:59
  • It says `mysqlnd 5.0.12-dev - 20150407` from `phpinfo()` so I guess not? – SeaBass Jun 11 '19 at 00:02
  • Is it the leading zeroes that makes it tricky? As long as 01 and 1 comes before 2, they can come in any order among themselves if that helps at all. 01, 1 or 1, 01 – SeaBass Jun 11 '19 at 00:04
  • That's the PHP MySQL driver version, not the version of MySQL. – Barmar Jun 11 '19 at 00:07
  • Oops, is there an easy way I can find out? Not good at the server stuff. It's an Ubuntu server at Digital Ocean that someone helped me set up. – SeaBass Jun 11 '19 at 00:08
  • Execute the query: `SELECT version();` – Barmar Jun 11 '19 at 00:10
  • @Barmar Thanks, no luck :( `5.7.18-0ubuntu0.16.04.1` – SeaBass Jun 11 '19 at 00:14
  • So, there are no possibility that the alphabetical character occurring in the beginning or in the middle of the value? like 'A1' or '0A1'? – FanoFN Jun 11 '19 at 02:07
  • 1
    Potentially, but 99.5% of the time no so it’s fine if that’s a dealbreaker. – SeaBass Jun 11 '19 at 02:20

1 Answers1

2

If you're using MySQL 8.0 or higher, you can use REGEXP_SUBSTR(colname, '[a-z]+$') to get the alphabetic suffix, and CAST(colname AS UNSIGNED) to get the numeric prefix. Then you can sort by these.

SELECT code
FROM yourTable
ORDER BY CAST(code AS UNSIGNED), REGEXP_SUBSTR(code, '[a-z]+$')

See What is the equivalent of REGEXP_SUBSTR in mysql? for how to get similar functionality in earlier versions of MySQL.

Another possibility is:

ORDER BY CAST(code AS UNSIGNED), TRIM(LEADING '0' FROM code)
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Thanks! That looks nice and simple. Can anyone help me translate that into older MySQL? I checked out the link, but not sure where to start. – SeaBass Jun 11 '19 at 00:31
  • We're not here to do your work for you. If you can't do it yourself, you need to hire a more competent programmer. – Barmar Jun 11 '19 at 00:32
  • I didn't mean to sound ungrateful. I've spent 20 minutes on the page you linked, but just doesn't know how to do it. I was hoping that this would have been an easy `CAST()` thing. I just created folders named the exact same names as I listed in my original post, guess what, Mac OS X is sorting it exactly the way I want as default, so I'm surprised that there isn't an easy way to do it in MySQL. Check out this screenshot: [link](https://www.dropbox.com/s/hhc34h119led6el/sorting.png?dl=0) – SeaBass Jun 11 '19 at 00:47
  • There's no easy solution for this. Did you see the answer that contains a link to a `PREG_CAPTURE()` user-defined function? – Barmar Jun 11 '19 at 00:50
  • Yes, sorry I just don't know what to do with it. I found this post. It looks like it's called "natural sorting" what I'm after. https://stackoverflow.com/a/12257917/3234322 – SeaBass Jun 11 '19 at 01:00
  • Natural sorting is useful when you want to sort by letters first, then numbers. You're going the other way. – Barmar Jun 11 '19 at 01:01
  • I might just go for 'CAST(field AS UNSIGNED), field'. It doesn't solve the leading zeroes, but most of the time it will be one or the other anyway so I can live with it. Thanks again for the help. – SeaBass Jun 11 '19 at 02:22
  • @SeaBass I tried that, but it didn't give the desired result. – Barmar Jun 11 '19 at 02:24
  • You could try `CAST(field AS UNSIGNED), TRIM(LEADING '0' FROM field)` to ignore the leading zeroes. – Barmar Jun 11 '19 at 02:25
  • 1
    I've tried that @Barmar but there are a few places where the ordering is off. For example, the first row should return '1' then '01' but instead with `ORDER BY CAST(field AS UNSIGNED), TRIM(LEADING '0' FROM field)`, it returns '01' first then '1'. I did however successfully order the results as OP's specified ASSUMING that the alphabetical characters only occurred at the end of every value .. using this `ORDER BY CAST(code AS UNSIGNED),RIGHT(code,1),LENGTH(code), TRIM(LEADING '0' FROM code)` .. adding to your original ordering suggestion – FanoFN Jun 11 '19 at 02:46
  • @tcadidot0 He said the relative ordering of `1` and `01` doesn't matter. – Barmar Jun 11 '19 at 02:47
  • @tcadidot0 It was in a comment, I just copied it into the question. – Barmar Jun 11 '19 at 02:49
  • Oook, sorry, I clearly missed that. In that case my suggestion is just to cater the exact OP's desired result only. – FanoFN Jun 11 '19 at 02:51
  • 1
    @tcadidot0 Awesome! Thank you! – SeaBass Jun 11 '19 at 03:44