167

If I use array_agg to collect names, I get my names separated by commas, but in case there is a null value, that null is also taken as a name in the aggregate. For example :

SELECT g.id,
       array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END) canonical_users,
       array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END) non_canonical_users
FROM groups g
GROUP BY g.id;

it returns ,Larry,Phil instead of just Larry,Phil (in my 9.1.2, it shows NULL,Larry,Phil).

Instead, if I use string_agg(), it shows me only the names (without empty commas or nulls).

The problem is that I have Postgres 8.4 installed on the server, and string_agg() doesn't work there. Is there any way to make array_agg work similar to string_agg() ?

Stephen
  • 8,508
  • 12
  • 56
  • 96
Daud
  • 7,429
  • 18
  • 68
  • 115
  • See this PostgreSQL mailing list thread on much this topic: http://postgresql.1045698.n5.nabble.com/array-agg-NULL-Handling-td2798942.html – Craig Ringer Oct 29 '12 at 14:16
  • I am sorry, I don't think there is a solution in that thread.. – Daud Oct 29 '12 at 14:33
  • There are two solutions in that thread. One is to create a function and the other (just suggested not shown) is the one I answered. – Clodoaldo Neto Oct 29 '12 at 14:37
  • @Clodoaldo - all the rows will have canonical in ('y','n')... so the where clause seems to be redundant. The problem is that inside a grouping, if the the value of the canonical field is 'Y', and we are collecting 'N's, then a null be collected too.. – Daud Oct 29 '12 at 14:44
  • Ok. Now I got it. Check the update answer. – Clodoaldo Neto Oct 29 '12 at 14:51
  • @Daud That mailing list thread mainly explains that there *isn't* a really nice, simple solution and that you have to filter the elements going into `array_agg` for null if you don't want nulls. It's providing background and context, not an answer. – Craig Ringer Oct 29 '12 at 23:43
  • I'm wondering if I should put together a patch for a null-ignoring `array_agg` variant, since it's so ugly to manually ignore nulls. – Craig Ringer Oct 29 '12 at 23:45
  • Thanks for all that. array_to_string works for me – Daud Oct 30 '12 at 05:57

9 Answers9

376

With postgresql-9.3 one can do this;

SELECT g.id,
   array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
   array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g 
GROUP BY g.id;

Update: with postgresql-9.4;

SELECT g.id,
   array_agg(g.users) FILTER (WHERE g.canonical = 'Y') canonical_users,
   array_agg(g.users) FILTER (WHERE g.canonical = 'N') non_canonical_users
FROM groups g 
GROUP BY g.id;

Update (2022-02-19): also with postgresql-9.4;

This results in an empty array when all values in an array are null instead of returning null;

SELECT g.id,
  coalesce( array_agg(g.users) FILTER (WHERE g.canonical = 'Y'), '{}' ) canonical_users,
  coalesce( array_agg(g.users) FILTER (WHERE g.canonical = 'N'), '{}' ) non_canonical_users
FROM groups g 
GROUP BY g.id;
Dale O'Brien
  • 4,108
  • 2
  • 15
  • 10
  • 7
    This works and is fast and elegant, it solved me a problem similar to the OP's. A reason to upgrading to 9.3 for those who didn't do it yet. +1 – Pavel V. Aug 29 '14 at 08:41
  • 16
    The 9.4 is even more elegant. Works like a charm – jmgarnier Sep 08 '15 at 13:41
  • 3
    The 9.4 variant is even better, because what I need to filter away in my case is the nulls. – coladict May 12 '17 at 12:01
  • I used the updated version first, but then realised I needed to remove Nulls and duplicates, so went back to the first suggestion. It's a large query, but it's to create a materialized view, so not a huge issue. – Relequestual Sep 12 '18 at 11:00
  • 4
    In case it isn't obvious, for other situations you can do something like `array_agg(col_to_aggregate) FILTER (WHERE col_to_aggregate IS NOT NULL)` if you just want to apply the filter directly to that col. – Stephen Aug 25 '21 at 13:47
51

If you are looking for a modern answer to the general question of how to remove a NULL from an array, it is:

array_remove(your_array, NULL)

I was specifically curious about performance and wanted to compare this to the best possible alternative:

CREATE OR REPLACE FUNCTION strip_nulls(
    IN array_in ANYARRAY
)
RETURNS anyarray AS
'
SELECT
    array_agg(a)
FROM unnest(array_in) a
WHERE
    a IS NOT NULL
;
'
LANGUAGE sql
;

Doing a pgbench test proved (with high confidence) that array_remove() is a little more than twice as fast. I did my test on double precision numbers with a variety of array sizes (10, 100 and 1000 elements) and random NULLs in between.


It's also worth noting that this can be used to remove blanks ('' != NULL). But the second parameter accepts anyelement, and since it is most likely they you'd be indicating a blank with a string literal, make sure to cast it to the form you want, usually a non-array.

For example:

select array_remove(array['abc', ''], ''::text);

If you try:

select array_remove(array['abc', ''], '');

it will assume that the '' is TEXT[] (array) and will throw this error:

ERROR: malformed array literal: ""

Alexi Theodore
  • 1,177
  • 10
  • 16
  • @VivekSinha what version of postgres are you using? I just tested your query and it resulted in "{1,2,3}" for me. I am using 12.1. – Alexi Theodore Aug 13 '20 at 02:43
  • 1
    Ah, I see @alexi-theodore what's happening at my end. I was using a custom+modified postgres driver. When I query directly in console, I can see the right output! Sorry about the confusion. Deleted prior comment and upvoted answer! – Vivek Sinha Aug 14 '20 at 03:26
  • 1
    Probably it is helpful to note that array_remove is supported since 9.3 – Anatoly Rugalev Oct 04 '20 at 12:48
38
select
    id,
    (select array_agg(a) from unnest(canonical_users) a where a is not null) canonical_users,
    (select array_agg(a) from unnest(non_canonical_users) a where a is not null) non_canonical_users
from (
    SELECT g.id,
           array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END) canonical_users,
           array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END) non_canonical_users
    FROM groups g
    GROUP BY g.id
) s

Or, simpler and may be cheaper, using array_to_string which eliminates nulls:

SELECT
    g.id,
    array_to_string(
        array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END)
        , ','
    ) canonical_users,
    array_to_string(
        array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END)
        , ','
    ) non_canonical_users
FROM groups g
GROUP BY g.id
Stephen
  • 8,508
  • 12
  • 56
  • 96
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
15

In solving the general question of removing nulls from array aggregates there are two main ways of attacking the problem: either doing array_agg(unnest(array_agg(x)) or creating a custom aggregate.

The first is of the form shown above:

SELECT 
    array_agg(u) 
FROM (
    SELECT 
        unnest(
            array_agg(v)
        ) as u 
    FROM 
        x
    ) un
WHERE 
    u IS NOT NULL;

The second:

/*
With reference to
http://ejrh.wordpress.com/2011/09/27/denormalisation-aggregate-function-for-postgresql/
*/
CREATE OR REPLACE FUNCTION fn_array_agg_notnull (
    a anyarray
    , b anyelement
) RETURNS ANYARRAY
AS $$
BEGIN

    IF b IS NOT NULL THEN
        a := array_append(a, b);
    END IF;

    RETURN a;

END;
$$ IMMUTABLE LANGUAGE 'plpgsql';

CREATE AGGREGATE array_agg_notnull(ANYELEMENT) (
    SFUNC = fn_array_agg_notnull,
    STYPE = ANYARRAY,
    INITCOND = '{}'
);

Calling the second is (naturally) a little nicer looking than the first:

select array_agg_notnull(v) from x;

Community
  • 1
  • 1
rorycl
  • 1,344
  • 11
  • 19
9

I am adding this even though this this thread is quite old, but I ran into this neat trick that works quite well on small arrays. It runs on Postgres 8.4+ without additional libraries or functions.

string_to_array(array_to_string(array_agg(my_column)))::int[]

The array_to_string() method actually gets rid of the nulls.

ced-b
  • 3,957
  • 1
  • 27
  • 39
6

You should wrap your array_agg with array_remove.

SELECT g.id,
       array_remove(array_agg(CASE WHEN g.canonical = 'Y' THEN g.users ELSE NULL END), NULL) canonical_users,
       array_remove(array_agg(CASE WHEN g.canonical = 'N' THEN g.users ELSE NULL END), NULL) non_canonical_users
FROM groups g
GROUP BY g.id;
Lulu
  • 438
  • 5
  • 15
3

As has been suggested in the comments you can write a function to replace nulls in an array, however as also pointed out in the thread linked to in the comments, this kind of defeats the efficiency of the aggregate function if you have to create an aggregate, split it then aggregate it again.

I think keeping nulls in the array is just a (perhaps unwanted) feature of Array_Agg. You could use subqueries to avoid this:

SELECT  COALESCE(y.ID, n.ID) ID,
        y.Users,
        n.Users
FROM    (   SELECT  g.ID, ARRAY_AGG(g.Users) AS Users
            FROM    Groups g
            WHERE   g.Canonical = 'Y'
            GROUP BY g.ID
        ) y
        FULL JOIN 
        (   SELECT  g.ID, ARRAY_AGG(g.Users) AS Users
            FROM    Groups g
            WHERE   g.Canonical = 'N'
            GROUP BY g.ID
        ) n
            ON n.ID = y.ID

SQL FIDDLE

GarethD
  • 68,045
  • 10
  • 83
  • 123
  • Thanks. But I needed 'case' to handle rows within a given grouping, and subqueries would be inefficient there – Daud Oct 29 '12 at 15:15
0

It is very simple, just first of all create a new - (minus) operator for text[]:

CREATE OR REPLACE FUNCTION diff_elements_text
    (
        text[], text[] 
    )
RETURNS text[] as 
$$
    SELECT array_agg(DISTINCT new_arr.elem)
    FROM
        unnest($1) as new_arr(elem)
        LEFT OUTER JOIN
        unnest($2) as old_arr(elem)
        ON new_arr.elem = old_arr.elem
    WHERE old_arr.elem IS NULL
$$ LANGUAGE SQL IMMUTABLE;

CREATE OPERATOR - (
    PROCEDURE = diff_elements_text,
    leftarg = text[],
    rightarg = text[]
);

And simply subtract the array[null]:

select 
    array_agg(x)-array['']
from
    (   select 'Y' x union all
        select null union all
        select 'N' union all
        select '' 
    ) x;

That's all:

{Y, N}

Miklos
  • 1
  • 1
  • 6
    `array_agg(x) FILTER (WHERE x is not null)` seems a lot easier: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=92da5321389fa18265aa415072377b98 and you don't really need your own function, you can simply use `array_remove()` https://dbfiddle.uk/?rdbms=postgres_12&fiddle=4149f45f6304623720ac51a4cefb918a –  Oct 29 '19 at 12:56
0

I did an array_except, the concept is array_except(x,y).

array_except(array_agg(x), array_agg(case when x is null then x end))
Tyler2P
  • 2,324
  • 26
  • 22
  • 31
Casey
  • 1