0

I have two tables, targets and rulesets. The rulesets table contains a column target_ids that is a CSV of primary key IDs into targets, and I want a SELECT to return any elements in the CSV field for a particular row have lost referential integrity to the targets table.

Is there any way to accomplish this without needing to create a temporary table?

I'm aware of some solutions to this issue that will extract the CSV values into a temporary table, such as this answer: https://stackoverflow.com/a/17043084/342692, but I'm trying to avoid a temporary table due to permission restrictions. I'm also aware of FIND_IN_SET being useful to compare a given value to a CSV list, but it doesn't seem to help in this case.

Here's an example of the schema and the goal, and a link to DB fiddle with the same content:

CREATE TABLE IF NOT EXISTS `targets` (
  `id` int(6) unsigned,
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `rulesets` (
  `id` int(6) unsigned,
  `target_ids` text,
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;

INSERT INTO `targets` VALUES
  (1),
  (2),
  (3);

INSERT INTO `rulesets` VALUES
  (1, '2,3,4');
-- what targets exist in `targets` table but not in the CSV ruleset?
-- returns [1]
select t.id from targets t where find_in_set(t.id, (select target_ids from rulesets where id=1)) = 0;

-- what targets exist in both the `targets` table and the CSV ruleset?
-- returns [2,3]
select t.id from targets t where find_in_set(t.id, (select target_ids from rulesets where id=1)) > 0;

-- what targets do not exist in `targets` table but are in the CSV ruleset?
-- returns [4]
-- ??? 
danske
  • 637
  • 1
  • 6
  • 11
  • To my mind, it's simple: normalise your schema, or use something other than an RDBMS for this – Strawberry Mar 31 '21 at 00:31
  • Thanks @Strawberry - I agree that this problem could be effectively circumvented by changing the requirements, and that this could arguably lead to a better solution. For the sake of both curiosity and the unfortunate reality that we engineers must sometimes work with sub-optimal existing software decisions, I am still curious if the problem can be solved as I've stated it. – danske Apr 02 '21 at 00:09
  • The fiddle indicates 5.7. Is that actually the version you're using? – Strawberry Apr 02 '21 at 07:08
  • Yep, using 5.7. – danske Apr 02 '21 at 17:35

1 Answers1

1

Here's one idea, using a simple utility table (ints) of integers (i) from 0-9...

-- what targets exist in `targets` table but not in the CSV ruleset?

SELECT a.id target_id 
  FROM targets a
  LEFT
  JOIN
     ( SELECT DISTINCT r.id
                     , SUBSTRING_INDEX(SUBSTRING_INDEX(r.target_ids,',',i.i+1),',',-1) target_id
                  FROM rulesets r
                  JOIN ints i
     ) b
    ON b.target_id = a.id
 WHERE b.id IS NULL;


-- what targets exist in both the `targets` table and the CSV ruleset?

SELECT a.id target_id
  FROM targets a
  JOIN
     ( SELECT DISTINCT r.id
                     , SUBSTRING_INDEX(SUBSTRING_INDEX(r.target_ids,',',i.i+1),',',-1) target_id
                  FROM rulesets r
                  JOIN ints i
     ) b
    ON b.target_id = a.id;
    
-- what targets do not exist in `targets` table but are in the CSV ruleset?
SELECT b.target_id 
  FROM targets a
  RIGHT
  JOIN
     ( SELECT DISTINCT r.id
                     , SUBSTRING_INDEX(SUBSTRING_INDEX(r.target_ids,',',i.i+1),',',-1) target_id
                  FROM rulesets r
                  JOIN ints i
     ) b
    ON b.target_id = a.id
 WHERE a.id IS NULL;

For the this last query, normally, you'd reverse the order of the tables above so as to avoid using a RIGHT JOIN, but I've written it this way to highlight its similarity to the first query

Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • Definitely an interesting solution! Still have to create that table but I kind of doubt that there's any way to do it in a more straightforward fashion than you've shown. Thanks for spending the time on it. – danske Apr 03 '21 at 00:29
  • Well upgrading would dispense with the need for the utility table, as would normalising your schema, of course. – Strawberry Apr 03 '21 at 05:47
  • Interesting. What is the MySQL feature that is available in later versions that would allow not using the utility table? – danske Apr 04 '21 at 06:12
  • 1
    Common Table Expressions essentially allow you to do the same thing 'on-the-fly'. – Strawberry Apr 04 '21 at 07:11