SQL group by distinct array

Question

I have table1:

col1 (integer)  |  col2 (varchar[]) | col3 (integer)    
----------------------------------------------------
    1           |  {A,B,C}          | 2
    1           |  {A}              | 5
    1           |  {A,B}            | 1
    2           |  {A,B}            | 2
    2           |  {A}              | 3
    2           |  {B}              | 1

I want summarize 'col3 ' with a GROUP BY 'col1 ' by keeping only DISTINCT values from 'col3 '

Expected result below :

col1 (integer)  |  col2 (varchar[]) | col3 (integer)    
----------------------------------------------------
    1           |  {A,B,C}          | 8
    2           |  {A,B}            | 6

I tried this :

SELECT col1, array_to_string(array_accum(col2), ','::text),sum(col3) FROM table1 GROUP BY col1

but the result is not the one expected :

col1 (integer)  |  col2 (varchar[])       | col3 (integer)    
---------------------------------------------------------------
    1           |  {A,B,C,A,A,B}          | 8
    2           |  {A,B,A,B}              | 6

do you have any suggestion?

Wha is the logic behind you wanting {A,B,C} for 1 and {A,B} for 2? — Matt, Mar 16 '16 at 10:58
@Matt I guess just distinct values. OP, is it possible to use array_accum(distinct col2)? — sagi, Mar 16 '16 at 11:00

score 0 · Answer 1 · answered Mar 16 '16 at 11:08

If the logic of which col2 you want is by the largest (like in your expected output is {A,B,C} & {A,B}.

SELECT col1, (SELECT sub.col2 
              FROM table1 sub
              INNER JOIN table1 sub ON MAX(char_length(sub.col2)) = col2
              WHERE sub.col1 = col1)
       SUM(col3) 
FROM table1 
GROUP BY col1

score 0 · Answer 2 · edited May 23 '17 at 11:45

SELECT 
col1, 
array_to_string(array_accum(col2), ','::text),
sum(col3) 
FROM table1 
GROUP BY col1;

but array_to_string concatenates array elements using supplied delimiter and optional null string.

You have to devise a different strategy like using array_dims(anyarray) to select the array with max elements or create a new aggregation function.

For this you could be interested in this answer:

eliminate duplicate array values in postgres

SQL group by distinct array

2 Answers2