MySQL - select distinct values from a column where values are separated by comma

Question

To get the list of distinct values from table1 and column1 is as easy as doing this:

SELECT distinct(column1) 
FROM table1

However, I inherited (unfortunately) a database where column1 contains values separated by a comma

column1
--------
row 1: name1,name2
row 2: name2,name3
row 3: name4,name1,name3

I need to get the list of distinct values from column1, so it looks like this:

column1
--------
name1
name2
name3
name4

Any ideas?

please read https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad mysql can't do this naturally you have to use functions or stored procedures for that — nbk, Oct 04 '20 at 12:37
Please, please, read about database normalization. And please avoid comma-separated values in columns whenever possible. They systematically defeat SQL query optimizations. — O. Jones, Oct 04 '20 at 13:06
please please stop talking about DB normalization, I can't fix problems someone else created... plus, someone already posted a very elegant solution to this issue — jjj, Oct 04 '20 at 13:07

score 3 · Accepted Answer · answered Oct 04 '20 at 12:57

A generic method uses a recursive query (available in MySQL 8.0 only):

with recursive 
    data as (select concat(column1, ',') rest from mytable),
    words as (
        select substring(rest, 1, locate(',', rest) - 1) word, substring(rest, locate(',', rest) + 1) rest
        from data
        union all
        select substring(rest, 1, locate(',', rest) - 1) word, substring(rest, locate(',', rest) + 1) rest
        from words
        where locate(',', rest) > 0
)
select distinct word from words order by word

Demo on DB Fiddle

Sample data:

| column1           |
| :---------------- |
| name1,name2       |
| name2,name3       |
| name4,name1,name3 |

Results:

| word  |
| :---- |
| name1 |
| name2 |
| name3 |
| name4 |

omg, such an elegant solution and it worked like a charm too... thank you — jjj, Oct 04 '20 at 13:03
on my server, it took about 10s to extract and sort distinct values from 160k rows...very livable,...now that I have those values, I can easily improve my app — jjj, Oct 04 '20 at 13:11

score 1 · Answer 2 · answered Oct 04 '20 at 12:47

You have to split them apart. If you have at most three names in the column, then one method is:

select substring_index(column1, ',', 1) as name
from t
union -- on purpose to remove duplicates
select substring_index(substring_index(column1, ',', 2), ',', -1) as name
from t
where name like '%,%'
union -- on purpose to remove duplicates
select substring_index(substring_index(column1, ',', 3), ',', -1) as name
from t
where name like '%,%,%';

score 0 · Answer 3 · answered Oct 04 '20 at 12:34

0

You can't. Your database does not adhere to the the first principle in designing normalized databases :- Atomicity. It says to store one and only one attribute in a column and yet you have so many. You need to retrieve the entire columns value, split and de-dupe them from your application. SQL cannot do this for you.

What you really need to do here is to have a seperate NAMES table and apply DISTINCT on name column after filtering relevant rows.

answered Oct 04 '20 at 12:34

Nilesh PS

356
3
8

It's not my database, so whomever not adhered to normalized principles, it wasn't me. In any case, I still have a problem at hand I need to solve in some way. Thanks for your answer though. – jjj Oct 04 '20 at 12:38
A simple "no" would not have helped a reader understand the problem fully hence I had to explain what's wrong here. I have also mentioned this can only be done by the application. I don't understand why the answer was still downvoted. – Nilesh PS Oct 04 '20 at 12:42

MySQL - select distinct values from a column where values are separated by comma

3 Answers3

Linked

Related