52

In Sqlite I can use group_concat to do:

1...A
1...B
1...C
2...A
2...B
2...C

1...C,B,A
2...C,B,A

but the order of the concatenation is random - according to docs.

I need to sort the output of group_concat to be

1...A,B,C
2...A,B,C

How can I do this?

user230781
  • 545
  • 1
  • 4
  • 5
  • An upstream discussion: https://sqlite.org/forum/info/a49d9c4083b5350c PostgreSQL has an easy syntax for it: https://stackoverflow.com/questions/43870/how-to-concatenate-strings-of-a-string-field-in-a-postgresql-group-by-query – Ciro Santilli OurBigBook.com Sep 19 '22 at 19:15

4 Answers4

73

Can you not use a subselect with the order by clause in, and then group concat the values?

Something like

SELECT ID, GROUP_CONCAT(Val)
FROM (
   SELECT ID, Val
   FROM YourTable
   ORDER BY ID, Val
   )
GROUP BY ID;
Community
  • 1
  • 1
Adriaan Stander
  • 162,879
  • 31
  • 289
  • 284
  • http://stackoverflow.com/questions/9833091/using-order-by-caluse-inside-group-concat-function-in-sqlite – confucius Mar 23 '12 at 01:54
  • 8
    Note that this will **not** work with MySQL. (No one said it did, but I made the mistake of assuming that was the case) – Lars Nyström Sep 02 '14 at 15:09
  • 3
    [The docs](https://www.sqlite.org/lang_aggfunc.html#groupconcat) stated : The order of the concatenated elements is arbitrary. Thought your answer can work, but I doubt it's order is guarantied by the docs. See also [another answer](https://stackoverflow.com/a/23783461/5197544). – Кое Кто Jul 12 '18 at 16:06
18

To be more precise, according to the docs:

The order of the concatenated elements is arbitrary.

It does not really mean random, it just means that the developers reserve the right to use whatever ordering they whish, even different ones for different queries or in different SQLite versions.

With the current version, this ordering might be the one implied by Adrian Stander's answer, as his code does seem to work. So you might just guard yourself with some unit tests and call it a day. But without examining the source code of SQLite really closely you can never be 100% sure this will always work.

If you are willing to build SQLite from source, you can also try to write your own user-defined aggregate function, but there is an easier way.

Fortunately, since version 3.25.0, you have window functions, providing a guaranteed-to-work, although somewhat ugly solution to your problem.

As you can see in the documentation, window functions have their own ORDER BY clauses:

In the example above, the window frame consists of all rows between the previous row ("1 PRECEDING") and the following row ("1 FOLLOWING"), inclusive, where rows are sorted according to the ORDER BY clause in the window-defn (in this case "ORDER BY a").

Note, that this alone would not necessarily mean that all aggregate functions respect the ordering inside a window frame, but if you take a look at the unit tests, you can see this is actually the case:

do_execsql_test 4.10.1 {
  SELECT a, 
    count() OVER (ORDER BY a DESC),
    group_concat(a, '.') OVER (ORDER BY a DESC) 
  FROM t2 ORDER BY a DESC
} {
  6 1 6
  5 2 6.5
  4 3 6.5.4
  3 4 6.5.4.3
  2 5 6.5.4.3.2
  1 6 6.5.4.3.2.1
  0 7 6.5.4.3.2.1.0
}

So, to sum it up, you can write

SELECT ID, GROUP_CONCAT(Val) OVER (PARTITION BY ID ORDER BY Val) FROM YourTable;

resulting in:

1|A
1|A,B
1|A,B,C
2|A
2|A,B
2|A,B,C

Which unfortunately also contains every prefix of your desired aggregations. Instead you want to specify the window frames to always contain the full range, then discard the redundant values, like this:

SELECT DISTINCT ID, GROUP_CONCAT(Val)
OVER (PARTITION BY ID ORDER BY Val ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM YourTable;

or like this:

SELECT * FROM (
    SELECT ID, GROUP_CONCAT(Val)
    OVER (PARTITION BY ID ORDER BY Val ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    FROM YourTable
)
GROUP BY ID;
szmate1618
  • 1,545
  • 1
  • 17
  • 21
0

Stumbling upon the underlying sorting-problem I tried this: (... on 10.4.18-MariaDB)

select GROUP_CONCAT(ex.ID) as ID_list
FROM (
SELECT usr.ID
FROM (
SELECT u1.ID as ID
FROM table_users u1
) usr
GROUP BY ID
) ex

... and found the serialized ID_list ordered! But I don't have an explanation for this now "correct" (?) result.

Jaybear
  • 11
  • 4
0

ALTERNATIVE solution: use recursion instead of GROUP_CONCAT. For the demo, here's a WORKGROUPS table:

create table Workgroups as 
select 1 as workgroup,'Daniel' as name union all
select 2,'Marc' union all
select 3,'Chris' union all
select 3,'Evelyn' union all
select 2,'Valentine' union all
select 1,'John' union all
select 3,'Luca' union all
select 2,'Thomas' union all
select 3,'Harry' union all
select 4,'Tom' union all
select 4,'Marilyn' union all
select 1,'Ben' union all
select 2,'Ann';

Now, I prepare a Ranked view in a WITH clause, using two rankings: rk for the general order of the output, in order to allow recursive chaining. rk2 to later identify the LAST occurrence of each group as rank 1. The trick resides in the Iif clause, which clears the aggregated chain whenever a group break occurs.

with Ranked as (
select workgroup,
  rank() over (order by workgroup,name) as rk,
  rank() over (partition by workgroup order by name desc) as rk2,
  name 
 from Workgroups
),Recursed as ( --follows the initial part
 select Ranked.workgroup,Ranked.rk,Ranked.rk2, Ranked.name as names from Ranked 
  where Ranked.rk=1 
   union all --follows the recursion part
 select Ranked.workgroup,Ranked.rk,Ranked.rk2, 
   iif(Recursed.workgroup=Ranked.workgroup,names || ', ','') || Ranked.name as names 
 from Recursed 
  join Ranked on Recursed.rk+1=Ranked.rk 
)
select workgroup,names from Recursed where rk2=1;

Looks bit clumsy, but pretty elegant, is it not ? Try it yourself.

The advantage over GROUP_CONCAT may be that within the Iif clause, you can do whatever you wish: complex separators, decide what to do with empty (Null) entries, process multiple fields, even build HTML code etc.

Comments welcome !

Pelton
  • 17
  • 2