1

I have inherited a table with information about some groups of people in which one field which contains delimited data, with the results matched to another table.

id_group     Name
-----------------------
1            2|4|5
2            3|4|6
3            1|2

And in another table I have a list of people who may belong to one or more groups

id_names     Names
-----------------------
1            Jack
2            Joe
3            Fred
4            Mary
5            Bill

I would like to perform a select on the group data which results in a single field containing a comma or space delimited list of names such as this from the first group row above "Joe Fred Bill"

I have looked at using a function to split the delimited string, and also looked at sub queries, but concatenating the results of sub queries quickly becomes huge.

Thanks!

GMB
  • 216,147
  • 25
  • 84
  • 135
Commsman
  • 23
  • 4
  • 3
    The best solution would be to redesign your table structure and not even attempt to proceed with this. – Shadow Mar 13 '19 at 22:40
  • Since your data is pipe-separated instead of comma-separated, you can't use `FIND_IN_SET()`. That's the most common solution, but not in your case. I understand you "inherited" this project and it's probably not up to you to redesign it, but you can curse your predecessor for leaving you with a badly designed app. – Bill Karwin Mar 13 '19 at 22:44
  • @billkarwin is it such a stretch to replace pipes with commas? – Strawberry Mar 13 '19 at 22:45
  • @Strawberry, `FIND_IN_SET()` does not support any alternative separator. It only supports commas. – Bill Karwin Mar 13 '19 at 22:46
  • plus, it may not just be a database but an entire system operating on the pipe delimited set. replacing with commas at this point could break something we dont know about – Kai Qing Mar 13 '19 at 22:48

2 Answers2

2

As implied by Strawberry's comment above, there is a way to do this, but it's so ugly. It's like finishing your expensive kitchen remodel using duct tape. You should feel resentment toward the person who designed the database this way.

SELECT g.id_group, GROUP_CONCAT(n.Names SEPARATOR ' ') AS Names
FROM groups AS g JOIN names AS n
  ON FIND_IN_SET(n.id_names, REPLACE(g.Name, '|', ','))
GROUP BY g.id_group;

Output, tested on MySQL 5.6:

+----------+---------------+
| id_group | Names         |
+----------+---------------+
|        1 | Joe Mary Bill |
|        2 | Fred Mary     |
|        3 | Jack Joe      |
+----------+---------------+

The complexity of this query, and the fact that it will be forced to do a table-scan and cannot be optimized, should convince you of what is wrong with storing a list of id's in a delimited string.

The better solution is to create a third table, in which you store each individual member of the group on a row by itself. That is, multiple rows per group.

CREATE TABLE group_name (
  id_group INT NOT NULL,
  id_name INT NOT NULL,
  PRIMARY KEY (id_group, id_name)
);

Then you can query in a simpler way, and you have an opportunity to create indexes to make the query very fast.

SELECT id_group, GROUP_CONCAT(names SEPARATOR ' ') AS names
FROM groups
JOIN group_name USING (id_group)
JOIN names USING (id_name)
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • I'm glad that someone's using a version of MySQL almost as old as mine :-p – Strawberry Mar 13 '19 at 23:00
  • Well, it's not technically EOL until February 2021. – Bill Karwin Mar 13 '19 at 23:01
  • Thanks this works for me. I also appreciate the comments from others regarding cr@p database design although it seems a bit pointless to say this when it has already been stated that this was inherited and not therefore in my power to change. – Commsman Mar 14 '19 at 12:31
  • We've all found ourselves in that position at one time or another. – Bill Karwin Mar 14 '19 at 15:46
2

Shadow is correct. Your primary problem is the bad design of relations in the database. Typically one designs this kind of business problems as a so-called M:N relation (M to N). To accomplish that you need 3 tables:

  • first table is groups that has a GroupId field with primary key on it and a readable name field (e.g. 'group1' or whatever)

  • second table is people that looks exactly as you showed above. (do not forget to include a primary key in the PeopleId field also here)

  • third table is a bridge table called GroupMemberships. That one has 2 fields GroupId and PeopleId. This table connects the first two with each other and marks the M:N relation. One group can have 1 to N members and people can be members of 1 to M groups.

Finally, just join together the tables in the select and aggregate:

SELECT 
    g.Name,
    GROUP_CONCAT(p.Name ORDER BY p.PeopleId DESC SEPARATOR ';') AS Members
FROM
    Groups AS g
    INNER JOIN GroupMemberships AS gm ON g.GroupId = gm.GroupId
    INNER JOIN people AS p ON gm.PeopleId = p.PeopleId
GROUP BY g.Name;
GMB
  • 216,147
  • 25
  • 84
  • 135
Tarek Salha
  • 307
  • 3
  • 12