-1

Here is an example of data:

'1.'    'Numeric types'
'1.1.'  'Integer'
'1.2.'  'Float'
...
'1.10'  'Double'

To naturally sort it we can use string_to_array with '.' as separator, then cast text[] to int[] and sort by integer array, but since the field itself is of type text and there might be cases where user decides to use non-numeric symbols, e.g. 1.1.3a, thus causing cast error. To address that I decided to use regexp:

select regexp_matches('1.2.3.4.', E'(?:(\\d+)\.?)+')

Expected result is array: {'1', '2', '3', '4'} but instead i get only the last element of the said array, however, if I use following regexp:

select regexp_matches('1.2.3.4.', E'((?:\\d+)\.?)+')

The result is {'1.2.3.4.'}.

Using global-flag 'g' is not an option, because regexp_matches returns a column.

Is there any way to convert '1.2.3.4a.'::text to {1, 2, 3 ,4}::int[] using only one regexp_matches?

Fiddle.

Nick
  • 123
  • 8

1 Answers1

2

You can use the global 'g' flag with regexp_matches, but needs to aggregate values to an array (most simple with the array() constructor):

select array(select m[1] from regexp_matches(dt_code, '(\d+)', 'g') m)::int[] nums, *
from data_types
order by 1;

Or, you can split your string to array with string_to_array(), but you still need to use regexp to remove any non-numeric characters:

select string_to_array(trim(regexp_replace(dt_code, '[^\d\.]+', ''), '.'), '.')::int[] nums, *
from data_types
order by 1;

For a more advanced natural-like sorting, you need to split your text to tokens yourself. See more info at the related SO question.

I could come up with a simplified, reusable function:

create or replace function natural_order_tokens(text)
  returns table (
    txt text,
    num int,
    num_rep text
  )
  language sql
  strict
  immutable
as $func$
  select m[1], (case m[2] when '' then '0' else m[2] end)::int, m[2]
    from regexp_matches($1, '(\D*)(\d*)', 'g') m
   where m[1] != '' or m[2] != ''
$func$;

With this function, natural sorting will be this easy:

select *
from data_types
order by array(select t from natural_order_tokens(dt_code) t);

SQLFiddle

Community
  • 1
  • 1
pozs
  • 34,608
  • 5
  • 57
  • 63
  • Did you even bother to read the question? You simply suggest every approach I stated in my question as inapplicable. Also the question was how to do it using only one `regexp_matches`, and you ignored that as well. What is the point of posting answer again? – Nick Dec 18 '14 at 21:16
  • @Nick yes, I read it. None of your approaches work in your question (as you stated), but these are working solutions. Also the *using only one `regexp_matches`* also in this answer (that is the first one). You couldn't do it without the `g` modifier. – pozs Dec 19 '14 at 08:36
  • Then the answer is **"No, you can not do it only with a single `regexp_matches` call"**. – Nick Dec 19 '14 at 10:11
  • @Nick but i could do it with a single *call* ... (using `g` does not call `regexp_matches` multiple times) – pozs Dec 19 '14 at 10:24
  • So you like to remove words from the question to make it easier. I will try once again: **ONLY with a single regexp_matches call**, meaning no other function calls. – Nick Dec 21 '14 at 02:20
  • @Nick what other functions do you see in my answer? `array()` and `array[]` are the array constructors, they are language features, not function calls http://www.postgresql.org/docs/current/static/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS – pozs Jan 05 '15 at 10:01