Using Sum of Group Concat in Having clause

Question

I want to find all ids of which sum of group_concat is 0 . Here is simplified table from mine.

╔════╦════╦══════════════╦
║ id ║ did║ group_concat ║
╠════╬════╬══════════════╬
║  1 ║  1 ║ 1,1,1        ║
║  2 ║  1 ║ 0            ║
║  3 ║  2 ║ 1,-1         ║
║  4 ║  2 ║ 1,-1,0       ║
║  5 ║  2 ║ 0,0,0        ║
║  6 ║  3 ║ 2,-2         ║
║  7 ║  3 ║ 1,-1,0       ║
║  8 ║  3 ║ 0,0,0        ║
╚════╩════╩══════════════╩

I want to get when sum of group_concat is 0 in the same dids. If sum of group concat in any of dids is not equal to zero, it shouldn't be on the table.

Here is the table below for better understanding.

╔═════╦═════════════════════╦
║ did ║ sum of group_concat ║
╠═════╬═════════════════════╬
║  2  ║ 0                   ║
║  3  ║ 0                   ║
╚═════╩═════════════════════╩

And this is the query statement I am trying to use.

select sum(val)
from user
group by did
having sum(val) = 0

seems sum in group_concat is not available.

is there any efficient way?

thank you in advance

`SUM(GROUP_CONCAT(val))` makes no sense. I think you just mean `SUM(val)` — Barmar, Mar 16 '20 at 22:25
@Barmar thank you for your opinion. but I would like to avoid getting the second row. If I use only Sum(val), it gets (id = 2) as well — , Mar 16 '20 at 22:31
There are two problems: You can't nest aggregate functions. And `SUM()` expects the argument to be a number, but `1,1,1` is not a number. It will convert it to a number, and ignore everything after the comma. — Barmar, Mar 16 '20 at 22:38
@Barmar isn't each 1,1,1 a number? how do I sum up every value in group_concat? val column is just in the first table but I removed to show cuz to simplify — , Mar 16 '20 at 22:41
Yes, but together they're a string. There's nothing that adds up the numbers in a comma-separated string. — Barmar, Mar 16 '20 at 22:42
As stated in one of the answers `SUM()` adds numbers from multiple rows, not numbers in the same string. — Barmar, Mar 16 '20 at 22:42

Gordon Linoff · Answer 1 · 2020-03-16T23:53:26.183

1

Just use sum():

select sum(val), group_concat(val)
from user
group by did
having sum(val) = 0

Edit:

Do you also want a condition such as:

having sum(val) = 0 and max(val) > 0

edited Mar 16 '20 at 23:53

answered Mar 16 '20 at 22:26

Gordon Linoff

1,242,037
58
646
786

thank you for your feedback. but if I only use sum(val) in having, it also displays the second row (id = 2) – Mar 16 '20 at 22:30

Uueerdo · Answer 2 · 2020-03-16T22:42:14.670

0

SUM and all other aggregation functions are for calculating across groups of rows; not summing multiple values within a field.

Having multiple values within a field is an anti-pattern, and possibly one of the worst you can have for usability in a relational database.

There is no easy way to do what you want in a query; SQL just wasn't designed for those kinds of operations. ("Those operations" being parsing a comma-delimited string of integer values into a list of integers.) For a pure SQL answer, the best you could hope for would be a stored function that takes a string and uses procedural logic to parse the values out of the string and then sum them.)

edited Mar 16 '20 at 22:42

answered Mar 16 '20 at 22:37

Uueerdo

15,723
1
16
21

It seems that he's changed the question. Originally it was `GROUP_CONCAT(val)` so it was concatenating across multiple rows. – Barmar Mar 16 '20 at 22:39
Yes, it is unclear from the question as none of his presented data actually has individual `val` data points. – Uueerdo Mar 16 '20 at 22:43

score 0 · Accepted Answer · answered Mar 16 '20 at 22:46

One solution involves unnesting the csv lists with a table of numbers:

select
    t.did,
    sum(substring_index(substring_index(grp_concat, ',', n.n), ',', - 1)) sum_of_grp_concat
from mytable t
inner join (select 1 n union all select 2 union all select 3 union all select 4 union all select 5) n
    on n.n <= 1 + char_length(grp_concat) - char_length(replace(grp_concat, ',', ''))
group by t.did
having sum_of_grp_concat = 0

The query joins the original table with the derived number table, and extracts each individual value using string functions; all that is left to is aggregate and filter with the having clause.

Demo on DB Fiddle:

did | sum_of_grp_concat
--: | ----------------:
  2 |                 0
  3 |                 0

This task would be far more easier if you were to fix your schema in order to store each csv value in a separate table row. Storing csv lists in a database column is a typical SQL antipattern: on this topic, I would recommend reading this famous SO answer.

Using Sum of Group Concat in Having clause

3 Answers3