0

I have a column in database:

Serial Number
-------------
S1
S10
...
S2
S11
..
S13

I want to sort and return the result as follows for serial number <= 10 :

S1
S2
S10

One way I tried was:

select Serial_number form table where Serial_Number IN ('S1', 'S2',... 'S10');

This solves the purpose but looking for a better way

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
iDev
  • 2,163
  • 10
  • 39
  • 64
  • 1
    For a general solution to this problem, see http://stackoverflow.com/questions/12965463/humanized-or-natural-number-sorting-of-mixed-word-and-number-strings . It may be unnecessarily complicated if your values are simpler and more consistent. – Craig Ringer Mar 27 '14 at 02:10

3 Answers3

3

Here is an easy way for this format:

order by length(Serial_Number),
         Serial_Number

This works because the prefix ('S') is the same length on all the values.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks didnt realize that it would work for the same prefix. But is there a way that I could limit it till a certain value? Say till S5 or S10? – iDev Mar 26 '14 at 21:54
  • @iDev . . . I don't really understand what you mean. `limit` takes a number of rows, not a value. – Gordon Linoff Mar 27 '14 at 00:56
2

For Postgres you can use something like this:

select serial_number
from the_table
order by regexp_replace(serial_number, '[^0-9]', '', 'g')::integer;

The regexp_replace will remove all non-numeric characters and the result is treated as a number which is suited for a "proper" sorting.

Edit 1:

You can use the new "number" to limit the result of the query:

select serial_number
from (
  select serial_number, 
         regexp_replace(serial_number, '[^0-9]', '', 'g')::integer as snum
  from the_table
) t 
where snum <= 10
order by snum;

Edit 2

If you receive the error ERROR: invalid input syntax for integer: "" then apparently you have values in the serial_number column which do no follow the format you posted in your question. It means that regexp_replace() remove all characters from the string, so a string like S would cause that.

To prevent that, you need to either exclude those rows from the result using:

where length(regexp_replace(serial_number, '[^0-9]', '', 'g')) > 0

in the inner select. Or, if you need those rows for some reason, deal with that in the select list:

select serial_number
from (
  select serial_number, 
         case
            when length(regexp_replace(serial_number, '[^0-9]', '', 'g')) > 0 then regexp_replace(serial_number, '[^0-9]', '', 'g')::integer as snum
            else null -- or 0 whatever you need
         end as snum
  from the_table
) t 
where snum <= 10
order by snum;

This is a really nice example on why you should never mix two different things in a single column. If all your serial numbers have a prefix S you shouldn't store it and put the real number in a real integer (or bigint) column.

Using something like NOT_SET to indicate a missing value is also a bad choice. The NULL value was precisely invented for that reason: to indicate the absence of data.

  • Why do we need flag 'g'? Also, is it possible to limit the resultset till say S5 or S10? – iDev Mar 26 '14 at 21:58
  • 1
    @iDev: just to be sure. If there were other non-numeric characters after the beginning, regexp_replace() with the `g` parameter wouldn't replace them. It make the replace "global" rather than just the firs occurance –  Mar 26 '14 at 22:00
  • Thanks @a_horse_with_no_name, but I am not sure how could the result only have till S10? – iDev Mar 26 '14 at 22:08
  • ERROR: invalid input syntax for integer: "" – iDev Mar 26 '14 at 22:29
  • @iDev: which means you have rows in there that do not contain any number, e.g. `S` –  Mar 26 '14 at 22:31
  • Just checked, yes I do have some rows with 'NOT_SET' – iDev Mar 26 '14 at 22:35
1

Since only the first character spoils your numeric fun, just trim it with right() and sort by the numeric value:

SELECT *
FROM   tbl
WHERE  right(serial_number, -1)::int < 11
ORDER  BY right(serial_number, -1)::int;

Requires Postgres 9.1 or later. In older versions substitute with substring (x, 10000).

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228