2

We wish to perform a GROUP BY operation on a table. The original table contains an ARRAY column. Within a group, the content of these arrays should be transformed into a single array with unique elements. No ordering of these elements is required. contain Newest PostgreSQL versions are available.

Example original table:

  id | fruit  | flavors               
---: | :----- | :---------------------
     | apple  | {sweet,sour,delicious}
     | apple  | {sweet,tasty}         
     | banana | {sweet,delicious}    

Exampled desired result:

count_total | aggregated_flavors          
----------: | :---------------------------
          1 | {delicious,sweet} 
          2 | {sour,tasty,delicious,sweet}

SQL toy code to create the original table:

CREATE TABLE example(id int, fruit text, flavors text ARRAY);
INSERT INTO example (fruit, flavors)
VALUES ('apple', ARRAY [ 'sweet','sour', 'delicious']),
       ('apple', ARRAY [ 'sweet','tasty' ]),
       ('banana', ARRAY [ 'sweet', 'delicious']);

We have come up with a solution requiring transforming the array to s

SELECT COUNT(*) AS count_total,
       array
  (SELECT DISTINCT unnest(string_to_array(replace(replace(string_agg(flavors::text, ','), '{', ''), '}', ''), ','))) AS aggregated_flavors
FROM example
GROUP BY fruit

However we think this is not optimal, and may be problematic as we assume that the string does neither contain "{", "}", nor ",". It feels like there must be functions to combine arrays in the way we need, but we weren't able to find them.

Thanks a lot everyone!

S-Man
  • 22,521
  • 7
  • 40
  • 63
elke
  • 1,220
  • 2
  • 12
  • 24

1 Answers1

3

demo:db<>fiddle

Assuming each record contains a unique id value:

SELECT
    fruit,
    array_agg(DISTINCT flavor),      -- 2
    COUNT(DISTINCT id)               -- 3
FROM
    example,
    unnest(flavors) AS flavor        -- 1
GROUP BY fruit
  1. unnest() array elements
  2. Group by fruit value: array_agg() for distinct flavors
  3. Group by fruit value: COUNT() for distinct ids with each fruit group.

if the id column is really empty, you could generate the id values for example with the row_number() window function:

demo:db<>fiddle

SELECT
    * 
FROM (
    SELECT
        *, row_number() OVER () as id
    FROM example
) s,
unnest(flavors) AS flavor
S-Man
  • 22,521
  • 7
  • 40
  • 63
  • Great answer, works like a charm! I would have never gotten the idea to shift the unnest to the FROM section. As a bonus, this approach extends naturally to aggregation of multiple Array columns. – elke Aug 13 '19 at 06:21