4

I'm writing a polymorphic PL/pgSQL function that iterates over an array. I am interested in using FOREACH, however I cannot figure out how to declare a temporary variable with the right type.

My function is below, for more information see the comment on line 4.

CREATE OR REPLACE FUNCTION uniq(ary anyarray) RETURNS anyarray AS $$
DECLARE
  ret ary%TYPE := '{}';
  v ???; -- how do I get the element type of @ary@?
BEGIN
  IF ary IS NULL THEN
    return NULL;
  END IF;

  FOREACH v IN ARRAY ary LOOP
    IF NOT v = any(ret) THEN
      ret = array_append(ret, v);
    END IF;
  END LOOP;

  RETURN ret;
END;
$$ LANGUAGE plpgsql;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Nate Symer
  • 2,185
  • 1
  • 20
  • 27

2 Answers2

4

Answer to primary question

AFAIK, you cannot declare a variable of a polymorphic type without a "template" variable or parameter.

There are related examples in the manual at the end of the chapter Declaring Function Parameters, but this trick is not covered: add another IN, INOUT or OUT parameter with data type ANYELEMENT to the function definition. It resolves to the matching element type automatically and can be (ab)used as variable inside the function body directly or as template for more variables:

CREATE OR REPLACE FUNCTION uniq1(ary ANYARRAY, v ANYELEMENT = NULL)
  RETURNS anyarray AS
$func$
DECLARE
   ret      ary%TYPE := '{}';
   some_var v%TYPE;  -- we could declare more variables now
                     -- but we don't need to
BEGIN
   IF ary IS NULL THEN
      RETURN NULL;
   END IF;

   FOREACH v IN ARRAY ary LOOP  -- instead, we can use v directly
      IF NOT v = any(ret) THEN
         ret := array_append(ret, v);
      END IF;
   END LOOP;

   RETURN ret;
END
$func$  LANGUAGE plpgsql;

Related:

Copying types like that only works in the DECLARE section and is different type casting. It is explained in the manual here.

Assign a default value, so the added parameter does not have to be included in the function call: ANYELEMENT= NULL

Call (unchanged):

SELECT uniq1('{1,2,1}'::int[]);
SELECT uniq1('{foo,bar,bar}'::text[]);

Better function

I would actually use an OUT parameter for convenience and invert the test logic:

CREATE OR REPLACE FUNCTION uniq2(ary ANYARRAY, elem ANYELEMENT = NULL
                               , OUT ret ANYARRAY)
  RETURNS anyarray AS
$func$
BEGIN
   IF ary IS NULL
      THEN RETURN;
      ELSE ret := '{}';  -- init
   END IF;

   FOREACH elem IN ARRAY ary LOOP
      IF elem = ANY(ret) THEN  -- do nothing
      ELSE
         ret := array_append(ret, elem);
      END IF;
   END LOOP;
END
$func$  LANGUAGE plpgsql;

But this still does not cover all cases containing NULL elements.

Proper function

To work for NULL elements as well:

CREATE OR REPLACE FUNCTION uniq3(ary ANYARRAY, elem ANYELEMENT = NULL
                               , OUT ret ANYARRAY)
  RETURNS anyarray AS
$func$
BEGIN
   IF ary IS NULL
      THEN RETURN;
      ELSE ret := '{}';  -- init
   END IF;

   FOREACH elem IN ARRAY ary LOOP
      IF elem IS NULL THEN  -- special test for NULL
         IF array_length(array_remove(ret, NULL), 1) = array_length(ret, 1) THEN
            ret := array_append(ret, NULL);
         END IF;
      ELSIF elem = ANY(ret) THEN  -- do nothing
      ELSE
         ret := array_append(ret, elem);
      END IF;
   END LOOP;
END
$func$  LANGUAGE plpgsql;

Checking for NULL in an array is a bit of a pain:

All of these functions are just proof of concept. I would use neither. Instead:

Superior solutions with plain SQL

In Postgres 9.4 use WITH ORDINALITY to preserve original order of elements. Detailed explanation:

Basic code for single value:

SELECT ARRAY (
   SELECT elem
   FROM  (
      SELECT DISTINCT ON (elem) elem, i
      FROM   unnest('{1,2,1,NULL,4,NULL}'::int[]) WITH ORDINALITY u(elem, i)
      ORDER  BY elem, i
      ) sub
   ORDER  BY i) AS uniq;

Returns:

uniq
------------
{1,2,NULL,4}

About DISTINCT ON:

Built into a query:

SELECT *
FROM   test t
     , LATERAL (
   SELECT ARRAY (
      SELECT elem
      FROM  (
         SELECT DISTINCT ON (elem) elem, i
         FROM   unnest(t.arr) WITH ORDINALITY u(elem, i)
         ORDER  BY elem, i
         ) sub
      ORDER BY i) AS arr
   ) a;

This has a tiny corner case: it returns an empty array a NULL array. To cover all bases:

SELECT t.*, CASE WHEN t.arr IS NULL THEN NULL ELSE a.arr END AS arr
FROM   test t
     , LATERAL (
   SELECT ARRAY (
      SELECT elem
      FROM  (
         SELECT DISTINCT ON (elem) elem, ord
         FROM   unnest(t.arr) WITH ORDINALITY u(elem, ord)
         ORDER  BY elem, ord
         ) sub
      ORDER BY ord) AS arr
   ) a;

Or:

SELECT *
FROM   test t
LEFT   JOIN LATERAL (
   SELECT ARRAY (
      SELECT elem
      FROM  (
         SELECT DISTINCT ON (elem) elem, i
         FROM   unnest(t.arr) WITH ORDINALITY u(elem, i)
         ORDER  BY elem, i
         ) sub
      ORDER BY i) AS arr
   ) a ON t.arr IS NOT NULL;

In Postgres 9.3 or older you can substitute with generate_subscripts():

SELECT *
FROM   test t
     , LATERAL (
   SELECT ARRAY (
      SELECT elem
      FROM  (
         SELECT DISTINCT ON (t.arr[i]) t.arr[i] AS elem, i
         FROM   generate_subscripts(t.arr, 1) i
         ORDER  BY t.arr[i], i
         ) sub
      ORDER  BY i
      ) AS arr
   ) a;

We need this in sqlfiddle, which currently only supports pg 9.3, so WITH ORDINALITY is not available:

SQL Fiddle.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • I like the WITH ORDINALITY part. That's exactly what I was looking for. – Nate Symer Nov 30 '15 at 22:05
  • really nice answer, i was looking for the same but couldn't find it anywhere else, it would be great if you can provide document link from where you got this--- > variable%TYPE. also it only works into declaration of variable i tried the same for typecasting but it didn't work – abhirathore2006 Jun 22 '16 at 13:44
  • @abhirathore2006: I added a link to the manual for `%TYPE` above. Typecasting is a different beast altogether. – Erwin Brandstetter Jun 22 '16 at 14:40
2

I do not know how to declare a variable of base type of an anyarray argument (the documentation makes no mention of such a possibility).

You can use FOR LOOP with integer variable instead:

CREATE OR REPLACE FUNCTION uniq(ary anyarray) RETURNS anyarray AS $$
DECLARE
  ret ary%TYPE := '{}';
  i int;
BEGIN
  IF ary IS NULL THEN
    return NULL;
  END IF;

  FOR i IN array_lower(ary, 1) .. array_upper(ary, 1) LOOP
    IF NOT ary[i] = any(ret) THEN
      ret = array_append(ret, ary[i]);
    END IF;
  END LOOP;

  RETURN ret;
END;
$$ LANGUAGE plpgsql;

However, loops and variables may not be necessary:

create or replace function uniq_without_loop(arr anyarray)
returns anyarray language plpgsql as $$
begin
    return (
        select array_agg(distinct elem)
        from unnest(arr) elem);
end $$;

The version of the above function that leaves the array order unchanged:

create or replace function unsorted_uniq_without_loop(arr anyarray)
returns anyarray language plpgsql as $$
begin
    return (
        select array_agg(elem)
        from (
            select elem 
            from (
                select distinct on(elem) elem, row_number() over ()
                from unnest(array[arr]) elem
                ) sub
            order by row_number
            ) sub);
end $$;
klin
  • 112,967
  • 15
  • 204
  • 232
  • I've tried your second solution. The problem is that the data I'm using must maintain order, but your second solution messes up order. – Nate Symer Nov 26 '15 at 03:35
  • Interestingly enough, someone submitted a patch just before version 9.4 of Postgres was frozen for release that implements `arr%elemtype`. – Nate Symer Nov 26 '15 at 03:38
  • I've added the version that doesn't sort the result. The idea of this special type makes sense. – klin Nov 26 '15 at 04:00