0

What I mean is, I have table with a "list" column. The data that goes into the "list" is related to addresses, so I sometimes get repeated zip codes for one record in that field.

For example, "12345,12345,12345,12456".

I want to know if it's possible to construct a query that would find the records that have an unknown string that duplicates within the field, such that I would get the records like "12345,12345,12345,12456", but not ones like "12345,45678,09876".

I hope that makes sense.

RiggsFolly
  • 93,638
  • 21
  • 103
  • 149
Ginger
  • 29
  • 4
  • Are you looking to return results for all rows that have a duplicate zip code? (Your example has 3 duplicates then another value of 12456 which is unclear) – Anthony Bird Feb 10 '22 at 12:20
  • Start by familiarising yourself with all the possible useful [MySQL string Functions](https://dev.mysql.com/doc/refman/8.0/en/string-functions.html) – RiggsFolly Feb 10 '22 at 12:23
  • 2
    If it is not too late, you may want to re-think your schema and not store lists as delimited strings, and instead store them in a separate table which makes tasks like this very simple - [Is storing a delimited list in a database column really that bad?](https://stackoverflow.com/q/3653462/1048425) – GarethD Feb 10 '22 at 12:36
  • 1
    Saving data like this its not best practice at all. save these sub lists in different table. Stored procedures can help you to receive the correct results. – 4EACH Feb 10 '22 at 12:53

2 Answers2

1

Yes, it is possible. You need to use a numbers table to convert your delimited string into rows, then use group by to find duplicates, e.g.

CREATE TABLE T (ID INT, List VARCHAR(100));
INSERT INTO T (ID, List)
VALUES (1, '12345,12345,12345,12456'), (2, '12345,45678,09876');

SELECT
  T.ID,
  SUBSTRING_INDEX(SUBSTRING_INDEX(T.list, ',', n.Number), ',', -1) AS ListItem
FROM T
    INNER JOIN
    (   SELECT 1 AS Number UNION ALL
        SELECT 2 UNION ALL 
        SELECT 3 UNION ALL
        SELECT 4 UNION ALL 
        SELECT 5
    ) AS n 
        ON CHAR_LENGTH(T.list)-CHAR_LENGTH(REPLACE(T.list, ',', ''))>=n.Number-1
GROUP BY T.ID, ListItem
HAVING COUNT(*) > 1;

If you don't have a numbers table you can create one in a derived query as I have above with UNION ALL

Example on DB Fiddle

With that being said, this is almost certainly not the right way to store your data, you should instead use a child table, e.g.

CREATE TABLE ListItems
(
      MainTableId INT NOT NULL, --Foreign Key to your current table
      ItemName VARCHAR(10) NOT NULL -- Or whatever data type you need
);

Then your query is much more simple:

SELECT T.ID, li.ItemName
FROM    T
        INNER JOIN ListItems AS li
            ON li.MainTableId = T.ID
GROUP BY T.ID, li.ItemName
HAVING COUNT(*) > 1;

If you need to recreate your original format, this is easily done with GROUP_CONCAT():

SELECT T.ID, 
        GROUP_CONCAT(li.ItemName) AS List
FROM    T
        INNER JOIN ListItems AS li
            ON li.MainTableId = T.ID
GROUP BY T.ID;

Example on DB Fiddle

GarethD
  • 68,045
  • 10
  • 83
  • 123
-1

I am still unclear what your desired result is based on your question however if it is simply to get all rows where there is a duplicate entry in column list you could do the following:

SELECT * FROM TABLE
WHERE COLUMN IN
   (SELECT COLUMN FROM TABLE 
   having count(*) >1)
Anthony Bird
  • 200
  • 2
  • 7