1

I recently dumped the records from an old CRM into a Navicat (MySQL) database as two tables.

__________________
| table_1:       |
|----------------|
| id             |
| delimited_col  |
------------------

__________________
| table_2:       |
|----------------|
| id             |
| name           |
------------------

Info:

  • Table 1:
    • table_1.delimited_col consists of concatenated names delimited by ";"

      ex of a record: "kevin;nancy;james;bart;delaney"

  • Table 2:
    • table_2.name consists of single names

      ex of a record: "kevin"

My Goal:

  • Create a join table (table_3) which checks each record in table_1.delimited_col to see if they contain a value from table_2.name.
    • If a record in table_1.delimited contains a substring found in table_2.name, add that record from table_1's id and the matching record from table_2's id to the join table_3.
__________________
| join_table:    |
|----------------|
| id             |
| table_1_id     |
| table_2_id     |
------------------

I've been bashing my head against the wall trying to figure this out and would appreciate any help or advice!

GMB
  • 216,147
  • 25
  • 84
  • 135
micomec
  • 71
  • 8

1 Answers1

2

You can use string method find_in_set():

select t1.id table_1_id, t2.id table_2_id
from table_1 t1
inner join table_2 t2 on find_in_set(t2.name, replace(t1.delimited_col, ';', ','))

But better yet, you should really fix your data model and store each value in the delimited list in a separate table row. Storing csv data in relational databases is not a good practice, and might hurt you in many ways. Here is a famous SO post that goes into details on this topic - actually, lots of questions related to delimited lists get closed as a duplicate to this post here on SO.

If you want to create a new table with the results of the query, then:

create table table_3(id int auto_increment, table_1_id, table_2_id);

insert into table_3 (table_1_id, table_2_id)
select t1.id, t2.id
from table_1 t1
inner join table_2 t2 on find_in_set(t2.name, replace(t1.delimited_col, ';', ','));
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thanks so much for the advice on find_in_set(). And as for table design, couldn't agree more. The import is a dump from lessAnnoyingCRM, my limited SQL experience had me dying on the inside when I saw that column. – micomec Jan 29 '20 at 17:11
  • After trying this, I am only getting one record as a result. It seems to be looking for an exact string match between the tables, rather than searching for a substring in table_1. The one table_1 record returned has an exact string match to a record in table_2 Any suggestions? – micomec Jan 29 '20 at 17:25
  • 1
    @micomec: what is the delimiter in your CSV list: a comma (`,`) or a semi-column (`;`)? – GMB Jan 29 '20 at 17:31
  • The delimiter was a semi-colon (`;`). – micomec Jan 29 '20 at 17:35
  • 1
    @micomec: ok. find_in_set() wants commas - I fixed my answer. Don’t use locate(), it will match partially on CSV elements, which is not what you want. – GMB Jan 29 '20 at 17:47
  • Thank you so much! Thats a perfect solution, and **such** a good call about partial matching. – micomec Jan 29 '20 at 17:57