0

There are two tables:

Firstly, req_data, which contains a column named req_id.

Secondly, tc_data which contains a column named Out-Link. That column holds (possibly) multiple values of a possible req_data.req_id value.

I got the following query

SELECT * FROM tc_data WHERE `Out-Link` LIKE '%//specific req_data.req_id value//%'

which returns for a specific value the results I want.I just don't know how to modify this query in order to get a result set which would look as if I would have executed the above query for every req_data.req_id value in the req_data.req_id column.

I tried

SELECT * FROM tc_data INNER JOIN req_data ON tc_data.`Out-Link` LIKE CONCAT('%',req_data.req_id,'%')

with no luck (takes very long to execute and does not give desired result).

Edit 1

Sample data req_data:

     req_id         |  *some other columns...*
-------------------------------------------------
abc_ersed_1023      |  ...
-------------------------------------------------
dkd_asdf-2132       |  ...
-------------------------------------------------
mcd-sad_sdf_120323  |  ...

sample data tc_data

    Out-Link         |   some other columns ...
---------------------|--------------------------
dkd_asdf-2132        |...
mcd-sad_sdf_120323   |
------------------------------------------------
mcd-sad_sdf_120323   |...
------------------------------------------------
                     |...
---------------------|--------------------------
abc_ersed_1023       |...
------------------------------------------------

Note that req_id's can occur in multiple rows in tc_data.

VGD
  • 436
  • 1
  • 5
  • 25
  • 1
    It will be slow because that `LIKE` test can't be indexed, so it has to test every combination. But the syntax looks correct. Can you show some sample data and the desired results? – Barmar Sep 11 '15 at 09:33
  • Is `Out-Link` a comma-separated list, and you want to match the values in the list? Then you should use `FIND_IN_SET`, not `LIKE`. – Barmar Sep 11 '15 at 09:36
  • `Out-link` is separated by `\r\n` – VGD Sep 11 '15 at 10:42
  • You can use any delimiter with `FIND_IN_SET`. But it's better to normalize your tables. – Barmar Sep 11 '15 at 10:47
  • as mentioned in comment below changing the design would be a huge effort and does not fit into my time schedule – VGD Sep 11 '15 at 10:57
  • OK, then use `FIND_IN_SET`, as explained in the question I marked this as a duplicate of. – Barmar Sep 11 '15 at 10:59
  • I made a mistake closing this question. `FIND_IN_SET` can only be used with comma-separated lists, it doesn't allow you to specify a custom delimiter. – Barmar Sep 11 '15 at 11:13
  • I tried FIND_IN_SET(req_data.req_id,replace(tc_data.`Out-Link`,'\r\n',',')), but still not the desired result – VGD Sep 11 '15 at 11:14
  • Can you make a sqlfiddle with sample data? – Barmar Sep 11 '15 at 11:15
  • Are you sure it contains `\r\n` and not just `\n`? – Barmar Sep 11 '15 at 11:16
  • doing SELECT * FROM tc_data WHERE FIND_IN_SET('specific value',replace(tc_data.`Out-Link`,'\r\n',',')) works, however doing SELECT * FROM tc_data INNER JOIN req_data ON FIND_IN_SET(req_data.req_id,replace(tc_data.`Out-Link`,'\r\n',',')) is apparently never ending... `tc_data`contains about 14000 rows, `req_data`contains about 4000 rows – VGD Sep 11 '15 at 11:22
  • This type of query can't be indexed. It's going to be very slow. Sorry. – Barmar Sep 11 '15 at 11:23
  • never ending as in 188 seconds – VGD Sep 11 '15 at 11:26
  • It has to do that `REPLACE` every time it compares two rows, and `FIND_IN_SET` is also slow. When you have a poorly designed database, you don't get good results. – Barmar Sep 11 '15 at 11:31
  • can you tell me why it is returning more rows than any of the tables have? I understand the query like that: Take `req_data's` `req_id`, take `tc_data's` `Out-Link` and return the row of the corresponding `req_data.req_id` if this `req_id`is found in the corresponding `tc_data.Out-Link` row. Is my understanding correct? If yes, shouldn't be the maximum of returned rows be maxof(`req_data`)? – VGD Sep 11 '15 at 11:33
  • It's returning a result row for each pair that matches. If one `Out-Link` matches multiple `req_id`, it will return a result for each of them. – Barmar Sep 11 '15 at 11:40

1 Answers1

0

You don't have the desired result because your LIKE clause can mix IDs, for instance, LIKE '%1%' will also find IDs 10, 11, etc. As said by @Barmar, you should first split your field according to a separator, then use FIND_IN_SET, but the performances would remain as bad, because no index could be used.

About the poor performances:

  • first, the design of your tables is very bad. It seems that you have a many-to-many relation between the entities in req_data and in tc_data. In such cases, you need a table to do the link, for instance:

CREATE TABLE req_to_tc_data ( 
    req_id ..., 
    tc_id ...);
  • In case it is not possible to change the design of your tables: the problem is that a query using a LIKE clause as yours can't use indexes. Assuming you use a MySQL database, if you wanted to be able to use an index, the Out-Link field should use a FULLTEXT index, and the query should use a MATCH...AGAINST... clause, see https://dev.mysql.com/doc/refman/5.6/en/fulltext-search.htm

    edit: oh, I just read in the documentation: "The search string must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.". So the FULLTEXT index solution won't work. You have to fix the design of your database.

FBB
  • 1,414
  • 2
  • 17
  • 29
  • the database design is not mine, so it would be quiet some work to refactor it hence I would rather make it work if possible – VGD Sep 11 '15 at 10:43
  • I forgot to mention it but no id value in `req_data.req_id` is a prefix of another one – VGD Sep 11 '15 at 10:44
  • OK, so your query should return the desired result, unless you need to add a `DISTINCT` clause to filter duplicates? Also, you should try the `FIND_IN_SET` solution, but I suspect it won't improve the performances. A solution is to keep the tables the same, but to add the link table I mentioned, and to populate it based on the `Out-link` field. This could be done in a temporary table at server startup if you don't want to permanently modify the database. – FBB Sep 11 '15 at 10:58