how to make array_agg() work like group_concat() from mySQL

Question

So I have this table:

create table test (
   id integer, 
   rank integer,
   image varchar(30)
);

Then some values:

id | rank | image   
---+------+-------  
 1 |    2 | bbb  
 1 |    3 | ccc  
 1 |    1 | aaa  
 2 |    3 | c  
 2 |    1 | a  
 2 |    2 | b

I want to group them by id and concatenate the image name in the order given by rank. In mySQL I can do this:

  select id, 
         group_concat( image order by rank asc separator ',' ) 
    from test 
group by id;

And the output would be:

1 aaa,bbb,ccc
2 a,b,c

Is there a way I can have this in postgresql?

If I try to use array_agg() the names will not show in the correct order and apparently I was not able to find a way to sort them. (I was using postgres 8.4 )

Jason Weathered · Accepted Answer · 2010-11-01T02:12:22.723

49

In PostgreSQL 8.4 you cannot explicitly order array_agg but you can work around it by ordering the rows passed into to the group/aggregate with a subquery:

SELECT id, array_to_string(array_agg(image), ',')
FROM (SELECT * FROM test ORDER BY id, rank) x
GROUP BY id;

In PostgreSQL 9.0 aggregate expressions can have an ORDER BY clause:

SELECT id, array_to_string(array_agg(image ORDER BY rank), ',')
FROM test
GROUP BY id;

edited Nov 01 '10 at 02:12

answered Oct 31 '10 at 04:00

Jason Weathered

7,762
2
41
37

Actually if I add more rows to my test table this solution does not work. Surprisingly enough is not working for me even if I load the table in the correct order. However your solution works in postgresql 8.4 only if I sort the subquery by both id and rank. weird if you ask me – user491575 Oct 31 '10 at 08:38
The workaround is a bit hacky and it appears to show for you. I suspect there's a query plan difference causing this. I'm somewhat curious as to what your `EXPLAIN` output is with the bad result. In any case, I've updated my answer with `ORDER BY id, rank`. – Jason Weathered Nov 01 '10 at 02:11
First I want to thank you for the answer. It does solve the problem I had. – user491575 Nov 01 '10 at 15:31
Second running a query on a small test set seem to be irrelevant. For example, I added 2 indexes to our test table , one on id, one on rank, and it works fine. On my real data though I still have to order by both id and rank like I said. And take a look at the plans for both cases: – user491575 Nov 01 '10 at 15:36
1

plan for test with 2 indexes: HashAggregate (cost=1.29..1.40 rows=7 width=82) -> Sort (cost=1.17..1.19 rows=7 width=86) Sort Key: test.rank -> Seq Scan on test (cost=0.00..1.07 rows=7 width=86) – user491575 Nov 01 '10 at 15:36
I have the same 2 indexes on my real data as well but the plan is different here: GroupAggregate (cost=41962.39..42622.69 rows=200 width=328) -> Sort (cost=41962.39..42181.16 rows=87507 width=328) Sort Key: x.id -> Subquery Scan x (cost=0.00..8452.33 rows=87507 width=328) -> Index Scan using tt2_idx on sequence (cost=0.00..7577.26 rows=87507 width=89) – user491575 Nov 01 '10 at 15:37
As I suspected, the real data uses a different query plan. With a small amount of data, it's actually quicker to just scan the whole table than to use the index. This is reflected in your `EXPLAIN` output. p.s. Something to take note of when benchmarking, etc is to run an `ANALYZE` on your table after making data changes to make sure the statistics are updated. – Jason Weathered Nov 08 '10 at 02:07
6

In Postgres 9.0 you don't need `array_agg` at all. You can use `string_agg(image, ',' order by rank)` – Apr 11 '14 at 14:10

how to make array_agg() work like group_concat() from mySQL

1 Answers1

Linked