1

I am using Laravel 4.2 and have a query:

    DB::table('data')->whereIn('t_id', $new_ids);
                ->whereNotIn('l_id', $old_ids);
                ->groupBy('l_id')->update(array('t_id' => $new_t_id));

This causes an error:

500 - SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '1569648-7302' for key 'data_l_id_t_id_unique' (SQL: update `data` set `t_id` = 7302 where `t_id` in (4772, 4860, 4861, 5653, 6396, 6743) and `l_id` not in (2994190)) @ /

The problems seems to be with groupBy() clause which is not being executed, thus making duplicate l_id to be pulled into the query:

mysql> select l_id from data where t_id=7302;
+---------+
| l_id |
+---------+
| 2994190 |
+---------+
1 row in set (0.00 sec)


mysql> select l_id from data where t_id in (4772, 4860, 4861, 5653, 6396, 6743);
+---------+
| l_id |
+---------+
| 1569648 |
| 1593870 |
| 1594096 |
| 1628872 |
| 1569648 |
| 1593870 |
| 1594096 |
| 1628872 |
| 1569648 |
| 1593870 |
| 1594096 |
| 1628872 |
| 1879092 |
| 2283518 |
| 2284586 |
| 2604466 |
+---------+
16 rows in set (0.00 sec)



mysql> select l_id from data where t_id in (4772, 4860, 4861, 5653, 6396, 6743) GROUP BY l_id;
+---------+
| l_id |
+---------+
| 1569648 |
| 1593870 |
| 1594096 |
| 1628872 |
| 1879092 |
| 2283518 |
| 2284586 |
| 2604466 |
+---------+
8 rows in set (0.00 sec)

Schema:

mysql> show create table data;

CREATE TABLE `data` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`l_id` bigint(20) unsigned NOT NULL,
`t_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `data_l_id_tag_id_unique` (`l_id`,`t_id`),
KEY `data_t_id_foreign` (`t_id`),
CONSTRAINT `data_l_id_foreign` FOREIGN KEY (`l_id`) REFERENCES `lis` (`id`),
CONSTRAINT `data_t_id_foreign` FOREIGN KEY (`t_id`) REFERENCES `tas` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4544794 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Basically I need to update t_id for certain l_ids but still ensure no duplicate t_id/l_id happen. I could do this by looping through each t_id and checking for duplicates before updating but thought a shortcut via groupBy() would be a better way of doing it.

Is it possible to make Laravel do a groupBy() while updating? More generally can an update be executed while checking for duplicates, even in plain SQL?

Edit: Separating update from group by

Making UPDATE and GROUP BY seperate helps resolve GROUP BY problem but not duplicate problem:

    $required_l_ids = DB::table('data')->whereIn('t_id', $new_ids);
                ->whereNotIn('l_id', $old_ids);
                ->groupBy('l_id')->lists('l_id');


    if ( !empty($required_l_ids) ) {
            DB::table('data')->whereIn('l_id', $required_l_ids)->whereIn('t_id', $new_ids)->update(array('t_id' => $new_tag_id));
    }

Still gives an error:

500 - SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '1593870-7302' for key 'data_l_id_t_id_unique' (SQL: update `data` set `t_id` = 7302 where `l_id` in (1593870, 1594096, 1628872, 1879092, 2283518, 2284586, 2604466) and `t_id` in (4772, 4860, 4861, 5653, 6396, 6743)) @ /

Edit 2: Sample data

CREATE TABLE `data` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`l_id` bigint(20) unsigned NOT NULL,
`t_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `data_l_id_t_id_unique` (`l_id`,`t_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4544794 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;


INSERT into data (l_id, t_id) VALUES (1569648,7302);
INSERT into data (l_id, t_id) VALUES (2994190,7302);
INSERT into data (l_id, t_id) VALUES (1593870,4772);
INSERT into data (l_id, t_id) VALUES (1594096,4772);
INSERT into data (l_id, t_id) VALUES (1628872,4772);
INSERT into data (l_id, t_id) VALUES (1569648,4860);
INSERT into data (l_id, t_id) VALUES (1593870,4860);
INSERT into data (l_id, t_id) VALUES (1594096,4860);
INSERT into data (l_id, t_id) VALUES (1628872,4860);
INSERT into data (l_id, t_id) VALUES (1569648,4861);
INSERT into data (l_id, t_id) VALUES (1593870,4861);
INSERT into data (l_id, t_id) VALUES (1594096,4861);
INSERT into data (l_id, t_id) VALUES (1628872,4861);
INSERT into data (l_id, t_id) VALUES (1879092,5653);
INSERT into data (l_id, t_id) VALUES (2283518,6396);
INSERT into data (l_id, t_id) VALUES (2284586,6396);
INSERT into data (l_id, t_id) VALUES (2604466,6743);


UPDATE data AS d1 LEFT JOIN data AS d2 ON d1.l_id = d2.l_id AND d2.t_id = 7302 SET d1.t_id = 7302 WHERE d1.t_id IN (4772,4860,4861,5653,6396,6743) AND d1.l_id NOT IN (1569648,2994190) AND d2.l_id IS NULL;

sqlfiddle: http://sqlfiddle.com/#!9/e9a50

Error: Duplicate entry '1593870-7302' for key 'data_l_id_t_id_unique'

Mugoma J. Okomba
  • 3,185
  • 1
  • 26
  • 37
  • Separate update operation from group by method? – Bartłomiej Sobieszek Sep 27 '16 at 19:26
  • @BartłomiejSobieszek I tried separating `groupBy()` and `update` but still ended up with duplicate error, though it resolved the `groupBy`. See updated question. – Mugoma J. Okomba Sep 27 '16 at 19:51
  • `GROUP BY` is for combining rows when you're selecting data. It makes no sense to use it when updating. Even if it worked, how would it solve your problem of not creating duplicates? – Barmar Sep 27 '16 at 20:03
  • @Barmar Other than looping though each t_id and checking for duplicates on each, how can one make an update and still ensure no duplicates happen? – Mugoma J. Okomba Sep 27 '16 at 20:11

1 Answers1

1

I don't know the Laravel syntax, but I think this is the MySQL syntax for what you want:

UPDATE data AS d1
JOIN (SELECT l_id, MIN(t_id) AS min_t_id
      FROM data
      WHERE d1.t_id IN ($new_ids)
      AND d1.l_id NOT IN ($old_ids)
      GROUP BY l_id) AS d3 ON d1.l_id = d3.l_id AND d1.t_id = d3.min_t_id
LEFT JOIN data AS d2 ON d1.l_id = d2.l_id AND d2.t_id = $new_tag_id
SET d1.t_id = $new_tag_id
WHERE d2.l_id IS NULL

This combines an UPDATE with the LEFT JOIN/NULL pattern in Return row only if value doesn't exist

The first JOIN makes sure that only one row for each l_id is updated, so you don't create duplicates. It arbitrarily chooses the lowest t_id to replace.

DEMO

Community
  • 1
  • 1
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Plain SQL is still fine. But the query you gave gives `ERROR 1052 (23000): Column 't_id' in where clause is ambiguous ` and `ERROR 1052 (23000): Column 'l_id' in where clause is ambiguous`, If I put d1,t_id and d2.l_id I get `Rows matched: 0 Changed: 0 Warnings: 0` which means no duplicates but nothing getting updated – Mugoma J. Okomba Sep 27 '16 at 20:27
  • Fixed. I originally wrote this using `WHERE NOT EXISTS`, then changed it to `LEFT JOIN`, and forgot to add all the aliases. – Barmar Sep 27 '16 at 20:28
  • The plain SQL also gives duplicate error: `UPDATE data AS d1 LEFT JOIN data AS d2 ON d1.l_id = d2.l_id AND d2.t_id = 7302 SET d1.t_id = 7302 WHERE d1.t_id IN (4772,4860,4861,5653,6396,6743) AND d1.l_id NOT IN (1569648,2994190) AND d2.l_id IS NULL;` gives `ERROR 1062 (23000): Duplicate entry '1593870-7302' for key 'data_l_id_t_id_unique'` – Mugoma J. Okomba Sep 27 '16 at 20:39
  • Can you make a sqlfiddle with some sample data. – Barmar Sep 27 '16 at 20:50
  • Updated question with sample data – Mugoma J. Okomba Sep 27 '16 at 21:29
  • I see the problem. My query prevents creating duplicates of existing rows, but if there are multiple rows with the same `l_id`, it will try to update all of them with the new `t_id`, and that creates duplicates. So it should only update one of them. How should it decide which? – Barmar Sep 27 '16 at 21:33
  • The way to do it is to join with a query that selects that row. – Barmar Sep 27 '16 at 21:34
  • If there are duplicates only one need to get updated and others ignored. It doesn't matter which one. What's important is that the l_id will be associated with new t_id. – Mugoma J. Okomba Sep 27 '16 at 21:43
  • Have to go now, I'll post the updated query late tonight or tomorrow. – Barmar Sep 27 '16 at 21:49
  • I've updated my answer to add a join with a subquery that selects one row per `l_id`. sqlfiddle.com is acting up so I haven't been able to test it yet. – Barmar Sep 28 '16 at 15:08
  • sqlfiddle is back, I've added a demo. – Barmar Sep 28 '16 at 15:23