0

Currently I'm trying to take a column from my database and trying to sort it so that the numbers that are in the column get ordered sequentially, and then the letters with numbers follow after. When I use the following statement

SELECT * 
FROM term_size_ref 
WHERE 1=1 
ORDER BY LPAD(term_size, 10000, '0');

I get the following result where after the letters with numbers come in, the ordering becomes mixed again.

Is there a way that I can have all the numbers in order before displaying the letters? (example 0.025 -> 1.0 -> 400 -> A3)

Result of the query:


0.025
0.045
0.25
0.90
0.1
0.9
4
12
13
22
040
45
50
070
90
A1
B1
M8
RH
W1
W2
W3
1.0
1.1
1.6
1.8
100
110
187
2.3
2.4
250
3.0
4.8
400
630
8.0
800
9.5

Heroice18
  • 1
  • 2

1 Answers1

0

Hmm, when the non numeric part is always a prefix and the other part some string that is a valid representation of a number (in decimal with optional sign) or an empty string -- your sample data suggests that --, you can first sort by the prefix lexicographically and then by the value of the number. To do that you can use regexp_replace() to remove the number or the prefix respectively.

SELECT *
       FROM term_size_ref
       ORDER BY regexp_replace(term_size,
                               '[\d.\-+]+$',
                               '') ASC,
                nullif(regexp_replace(term_size,
                                     '^[^\d.\-+]+',
                                     ''),
                       '')::decimal ASC
                                    NULLS FIRST;

db<>fiddle

If possible you might also consider to split the data and have the prefix in its own column.

sticky bit
  • 36,626
  • 12
  • 31
  • 42
  • Problem is that the query still groups the numbers by the first value. So 1, 2, 3, 4, 10, 11, 21 ,22 will be printed out as 1, 10, 11, 2, 21, 22, 3, 4 – Heroice18 Dec 22 '21 at 17:25
  • @Heroice18: No it doesn't. It sorts the second part by its *numeric value*, **not** lexicographically. See: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=f991fd038cf32f4143622a398c0c5da9 – sticky bit Dec 22 '21 at 18:38
  • It seems like the query will fail if the "::decimal" is part of the query. As it seems like it fails because it encounters text instead of numbers? When I remove it the query returns the numbers out of order. For example 1.6, 1.8, 10, 100, 1000, 1111111111, 12, 187, 2.6, 2.8, 9, 9.5, 90, 1.0/2.4 – Heroice18 Dec 22 '21 at 20:48
  • @Heroice18: Of course is does then. I placed the cast there for a purpose, that is exactly to fore a numeric context, so that the ordering is done numerically. – sticky bit Dec 22 '21 at 20:51
  • So then how do I resolve the "invalid input syntax for type numeric: "" " error? Do I need to put in a CAST? Or what do I need to do to make the query to work? – Heroice18 Dec 22 '21 at 21:24
  • @Heroice18: I did miss that there were values in your sample data that don't have a numeric suffix, so that would be empty. We can nullify it using `nullif()`. I edited the answer. – sticky bit Dec 22 '21 at 21:50
  • So in my data set (I didn't post the entirety of it because it is a lot of data). It seems to be failing on "TEST121Again" Where the error is ERROR: invalid input syntax for type numeric: "121Again". Do I need to add another case for having letters before numbers and then having numbers after? – Heroice18 Dec 22 '21 at 22:18
  • @Heroice18: Yes, you need to handle the non number suffix then as well. Where in the order should it be used? As the last one? – sticky bit Dec 22 '21 at 22:20
  • Yeah I think it would be good as the last one – Heroice18 Dec 22 '21 at 22:32
  • How would I do that? In my testing there seems to be the same error when it encounters special characters, for example * _ - , / – Heroice18 Dec 23 '21 at 16:25