How to access the internal index of an array in PostgreSQL?

Question

This is my (perhaps familiar) non-optimized solution, a workaround for the missing built-in function in PostgreSQL:

CREATE FUNCTION unnest_with_idx(anyarray)
RETURNS TABLE(idx integer, val anyelement) AS
$$ 
   SELECT generate_series(1,array_upper($1,1)) as idx, unnest($1) as val;
$$ LANGUAGE SQL IMMUTABLE;

Test:

SELECT idx,val from unnest_with_idx(array[1,20,3,5]) as t;

But, as I said, it is not optimized. I can't believe (!!) that PostgreSQL doesn't have an internal index for arrays. If it does, the question is how to access that index directly: where is the GIN-like internal counter?

NOTE1: the solution above and this question are not the same as "how do you create an index by each element of an array?", nor the same as "Can PostgreSQL index array columns?", because the function operates on an isolated array, not on a table index for array fields.

NOTE2 (edited after answers): "array indexes" (the more popular term), "array subscripts", and "array counter" are all terms we can use to refer to the "internal counter", the accumulator that advances to the next array item. I see that no PostgreSQL command offers direct access to this counter. Like generate_series(), the generate_subscripts() function is a sequence generator, and its performance is (better but) nearly the same. On the other hand, row_number() offers direct access to an "internal counter of rows", but it is about rows, not arrays, and unfortunately its performance is worse.

Erwin Brandstetter · Accepted Answer · 2022-04-25T22:42:16.327

Postgres 9.4 or later

For 1-dimensional arrays with standard subscripts (which is almost always the case), use WITH ORDINALITY (new in 9.4) instead:

SELECT t.*
FROM   unnest(ARRAY[1,20,3,5]) WITH ORDINALITY t(val, idx);

See:

PostgreSQL unnest() with element number

Just make sure you don't trip over non-standard subscripts. See:

Normalize array subscripts so they start with 1
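
As a small illustration of that caveat (a sketch reusing the '[4:7]=...' array literal that appears later in this answer): WITH ORDINALITY numbers the rows returned by unnest() starting from 1, so it does not report the array's own subscripts.

```sql
-- The array's own subscripts run from 4 to 7,
-- but idx counts the rows produced by unnest(): 1, 2, 3, 4.
SELECT t.*
FROM   unnest('[4:7]={1,20,3,5}'::int[]) WITH ORDINALITY t(val, idx);
```

If the original subscripts matter, generate_subscripts() is the function that reports them.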

Postgres 9.3 or earlier

(Original answer.)

Postgres does provide dedicated functions to generate array subscripts:

WITH   x(a) AS (VALUES ('{1,20,3,5}'::int[]))
SELECT generate_subscripts(a, 1) AS idx
     , unnest(a) AS val
FROM   x;

Effectively, it does almost the same as @Frank's query, just without a subquery.
Plus, it also works for subscripts that do not start with 1.

Either solution works for 1-dimensional arrays only! (Can easily be expanded to multiple dimensions.)
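
A possible sketch of the multi-dimensional case, assuming a 2-dimensional array and Postgres 9.3 or later (the set-returning functions in the FROM list refer to x.a, which relies on implicit LATERAL). The sample array and the aliases x, i, j are made up for illustration:

```sql
-- One generate_subscripts() call per dimension;
-- each (i, j) pair addresses one element of the 2-D array.
SELECT i, j, x.a[i][j] AS val
FROM  (SELECT '{{1,2,3},{4,5,6}}'::int[] AS a) x
     , generate_subscripts(x.a, 1) AS i   -- first dimension:  1, 2
     , generate_subscripts(x.a, 2) AS j   -- second dimension: 1, 2, 3
ORDER  BY i, j;
```

This returns one row per element (six rows here), each with both of its subscripts.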

Function:

CREATE OR REPLACE FUNCTION unnest_with_idx(anyarray) 
  RETURNS TABLE(idx integer, val anyelement)
  LANGUAGE sql IMMUTABLE AS
$func$
  SELECT generate_subscripts($1, 1), unnest($1);
$func$;

Call:

SELECT * FROM unnest_with_idx('{1,20,3,5}'::int[]);

Also consider:

SELECT * FROM unnest_with_idx('[4:7]={1,20,3,5}'::int[]);

About custom array subscripts:

Normalize array subscripts so they start with 1

To get normalized subscripts starting with 1 for a 1-dimensional array:

SELECT generate_series(1, array_length($1,1)) ...

That's almost the query you had already, just with array_length() instead of array_upper() - which would fail with non-standard subscripts.
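
To see why array_upper() is the wrong bound here, it may help to compare the three functions on the array with custom subscripts used above (a minimal sketch; the column aliases are arbitrary):

```sql
SELECT array_lower(a, 1)  AS lo    -- 4
     , array_upper(a, 1)  AS hi    -- 7
     , array_length(a, 1) AS len   -- 4
FROM  (SELECT '[4:7]={1,20,3,5}'::int[] AS a) x;
```

generate_series(1, array_upper(a, 1)) would therefore produce seven index values for an array that holds only four elements, while generate_series(1, array_length(a, 1)) produces exactly four.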

Performance

I ran a quick test on an array of 1000 integers with all queries presented here so far. They all perform about the same (~3.5 ms) - except for row_number() on a subquery (~7.5 ms) - as expected, because of the subquery.
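
One way to reproduce this kind of comparison is EXPLAIN ANALYZE on a generated array. This is only a sketch under assumed conditions, not the exact test behind the numbers above; timings depend on hardware and Postgres version:

```sql
-- Build a 1000-element int array on the fly, then time one variant.
-- Swap the SELECT list for the other variants to compare them.
EXPLAIN ANALYZE
SELECT generate_subscripts(a, 1) AS idx
     , unnest(a) AS val
FROM  (SELECT ARRAY(SELECT generate_series(1, 1000)) AS a) x;
```

Running each variant several times and comparing the reported execution times gives a rough picture; a single run is too noisy to draw conclusions from.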

Thanks! Well, there are two new solutions after *generate_series()*... Which is the fastest: *row_number()* or *generate_subscripts()*? – Peter Krauss, Sep 03 '12 at 15:45
@PeterKrauss It depends on your use case (I mean, test it with your data). `generate_subscripts()` is much more readable for me. I wouldn't expect much difference when used on arrays of normal size, however. – dezso, Sep 03 '12 at 16:03
@PeterKrauss: I added an alternative solution, plus results of a quick performance test. – Erwin Brandstetter, Sep 03 '12 at 16:33
Ok, comparing `explain analyse` times with the "SELECT 1, unnest($1)" time, I have subscripts=109%, series=111%, and row_number 150%. Your solution with `generate_subscripts()` is the best (!), and my solution with `generate_series()` is not so wrong. – Peter Krauss, Sep 03 '12 at 18:15
To the reader: `WITH ORDINALITY` is perfect! In pg9.5 it also works fine with JSON and JSONB arrays! `SELECT * FROM jsonb_array_elements( '[20,11,3,5]'::JSONB ) WITH ORDINALITY` – Peter Krauss, May 13 '16 at 18:37

score 1 · Answer 2 · edited Sep 03 '12 at 14:14

row_number() works:

SELECT 
    row_number() over(), 
    value
FROM (SELECT unnest(array[1,20,3,5])) a(value);

Then the optimized function would be:

CREATE OR REPLACE FUNCTION unnest_with_idx(anyarray) 
RETURNS table(idx integer, val anyelement) AS $$ 
  SELECT (row_number() over())::integer as idx, val
  FROM (SELECT unnest($1)) a(val);
$$ LANGUAGE SQL IMMUTABLE;

edited Sep 03 '12 at 14:14

Peter Krauss

answered Sep 03 '12 at 12:27

Frank Heikens

Is it guaranteed to keep the order? – Quassnoi Sep 03 '12 at 12:30
@Quassnoi: I didn't test it, and without a sort order, there is no guarantee – Frank Heikens Sep 03 '12 at 12:31
Yes! I am testing, and I think array order is always preserved. Thanks a lot, it is the solution. – Peter Krauss Sep 03 '12 at 13:53
@PeterKrauss: I've approved your edit. Note that you could also answer your own question ([see this thread](http://meta.stackexchange.com/questions/17463/can-i-answer-my-own-questions-even-those-where-i-knew-the-answer-before-asking)) – JMax Sep 03 '12 at 14:16
Wouldn't this fail for arrays with non-standard subscripts like `'[4:7]={1,20,3,5}'::int[]`? – Erwin Brandstetter Sep 03 '12 at 14:46
Sorry, I changed my "chosen answer": my benchmark of relative times for row_number (150%), generate_subscripts (109%) and generate_series (111%) shows that row_number is the worst (!) and generate_subscripts the fastest, so it comes closest to my need for an "internal array index" and for optimization. – Peter Krauss Sep 03 '12 at 18:21
