I am creating a new unique index of two fields:
ALTER TABLE `tools`
ADD UNIQUE INDEX `name_url` (`name`, `url`);
The problem is that the db has already 500000 tools and other tables refer to it.What's the best strategy to remove the doubles?
I am creating a new unique index of two fields:
ALTER TABLE `tools`
ADD UNIQUE INDEX `name_url` (`name`, `url`);
The problem is that the db has already 500000 tools and other tables refer to it.What's the best strategy to remove the doubles?
To update another table that refers to this table, do:
UPDATE otherTable AS o
JOIN tools AS t1 on o.tools_id = t1.id
JOIN (SELECT name, url, MIN(id) AS minid
FROM tools
GROUP BY name, url) AS t2
ON t1.name = t2.name AND t1.url = t2.url AND t1.id != t2.minid
SET o.tools_id = t2.minid
This sets all foreign keys to point to the minimum ID for each (name, url)
pair. You'll have to do this for every other table that has foreign keys pointing to tools
.
Then delete all the extra rows:
DELETE t1.* FROM tools AS t1
JOIN (SELECT name, url, MIN(id) AS minid
FROM tools
GROUP BY name, url) AS t2
ON t1.name = t2.name AND t1.url = t2.url AND t1.id != t2.minid
Once you've done all this you can add the unique index.