0

I have several lines of SQL code for my legacy database with GROUP_CONCAT statements, as in:

SELECT SUM(age), GROUP_CONCAT(sal) FROM Users;

In PostgreSQL, I can do the same with:

SELECT SUM(age), string_agg(sal, ', ') FROM Users;

I would like to reuse the old SQL as much as possible. So I need to define a GROUP_CONCAT function that internally calls string_agg. Is this possible?

EDIT: The linked question is unrelated!

My question asks "How to define a function called group_concat?". The linked question says "How can we do something equivalent to group concat (without defining a new function)?". None of the answers, also match what I want.

The selected answer is spot on! Thankfully it was answered before getting closed.

Jus12
  • 17,824
  • 28
  • 99
  • 157

3 Answers3

3

There is a string_agg() builtin which does what you want, but you specifically ask for it to be named group_concat for MySQL compatibility. Unfortunately, string_agg() uses an internal data type for accumulation (presumably to avoid copying the whole buffer on each append, I have not looked at the source though) and I didn't find a way to declare a SQL aggrerate identical to string_agg().

Defining group_concat() function would not work either, as pg has to be made aware that it is an aggregate, not a function with an aggregate hidden inside, which would not work. Such a function would operate on one row at a time: any aggregate inside would just aggregate a single row and return it unchanged...

Thus, this code will accumulate the elements into an array, then add the "," delimiters with array_to_string. I will use the array_agg() declaration (before it became a built-in) as a model, and simply add a finalizer function which will convert the aggregated array into text.

CREATE OR REPLACE FUNCTION _group_concat_finalize(anyarray)
RETURNS text AS $$
    SELECT array_to_string($1,',')
$$ IMMUTABLE LANGUAGE SQL;

CREATE AGGREGATE group_concat(anyelement) (
   SFUNC=array_append,
   STYPE=anyarray,
   FFUNC=_group_concat_finalize,
   INITCOND='{}'
);

SELECT group_concat(x) FROM foo;

The nice thing is that it should work fine for any type, without hassle, thanks to the generic types "anyarray" and "anyelement".

I would presume this would be slower than string_agg() if string_agg does indeed avoid to copy the whole aggregation array on each append. This should matter only if the number of rows to be grouped into each set is large, though. In this case, you probably can spend a minute editing the SQL query ;)

http://sqlfiddle.com/#!17/c452d/1

bobflux
  • 11,123
  • 3
  • 27
  • 27
  • Tested and works as expected! The key thing of this solution is that it works for any type! – Jus12 Dec 04 '17 at 18:06
  • The number of rows are less than 100. So this seems to be fine. – Jus12 Dec 05 '17 at 08:29
  • 1
    Why not aliasing the existing string_agg() function, using e.g. (this recipe)[https://stackoverflow.com/questions/40300728/function-alias-for-postgres-default-function]? – fralau Aug 22 '19 at 15:24
2

Yes this is possible. Have a look at https://github.com/2ndQuadrant/mysqlcompat/blob/master/sql_bits/aggregate.sql where it's already done, as such:

-- GROUP_CONCAT()
-- Note: only supports the comma separator
-- Note: For DISTINCT and ORDER BY a subquery is required
CREATE OR REPLACE FUNCTION _group_concat(text, text)
RETURNS text AS $$
  SELECT CASE
    WHEN $2 IS NULL THEN $1
    WHEN $1 IS NULL THEN $2
    ELSE $1 operator(pg_catalog.||) ',' operator(pg_catalog.||) $2
  END
$$ IMMUTABLE LANGUAGE SQL;

CREATE AGGREGATE group_concat (
    BASETYPE = text,
    SFUNC = _group_concat,
    STYPE = text
);
  • The linked code does not work as my columns return numeric type. I get the error `function group_concat(numeric) does not exist. Hint: No function matches the given name and argument types. You might need to add explicit type casts.`. – Jus12 Dec 04 '17 at 16:58
  • Then either add a group_concat() aggregate for numeric values if you really don't want to change your SQL, or cast the values to text in the SQL query. – Dimitri Fontaine Dec 04 '17 at 17:02
  • SQL query is in compiled code, so changing that is not an option. Will try the numeric type. I don't really understand the syntax at the link. Any references there? – Jus12 Dec 04 '17 at 17:03
  • Hi @DimitriFontaine, I wanted to improve my answer, so I thought either create a copy of string_agg with a new name (but there does not seem to be DDL for this, aggregate can be renamed but apparently not copied) or do CREATE AGGREGATE with the same parameters as used for creating string_agg, however string_agg uses an "internal" type as accumulator, so I don't think it is possible either... – bobflux Dec 04 '17 at 17:25
  • 1
    Thanks for the review, I added the referenced solution. @peufeu answer is better anyway, I upvoted it! – Dimitri Fontaine Dec 04 '17 at 17:49
0

The aggregate function can be modified to work with columns of any type. The result is of type text, though, in this version. It is possible to return the array type of the base type by leaving out the final function.

CREATE OR REPLACE FUNCTION _group_concat(state anyarray, value anyelement)
RETURNS anyarray AS $$
  SELECT CASE
    WHEN value IS NULL THEN state
    WHEN state IS NULL THEN array_append('{}', value)
    ELSE array_append (state, value)
  END
$$ IMMUTABLE LANGUAGE SQL;

CREATE OR REPLACE FUNCTION _group_concat_final(state anyarray) 
RETURNS TEXT AS $$
  SELECT array_to_string (state, ', '::text);
$$ IMMUTABLE STRICT LANGUAGE SQL;

CREATE AGGREGATE group_concat (
  BASETYPE = anyelement,
  SFUNC = _group_concat,
  FINALFUNC = _group_concat_final,
  STYPE = anyarray
);

The line WHEN value IS NULL THEN state makes the aggregate function ignore NULL values. If it is left out, NULL values are put into the array. In order to show them, add another argument to the call of array_to_string() with the value you would like to see in the output.

Using array_to_json() instead of array_to_string() is another nice way of creating the final output.

Holger Jakobs
  • 984
  • 3
  • 11
  • 32