0

I have a project where on a weekly basis I import several large datasets that contain incorrect data, such as duplicate employee IDs where they should not be duplicates. In an effort to flag the duplicates, I try the following code:

ALTER TABLE AccountDuplicates
  ADD UNIQUE INDEX EmployeeID (EmployeeID);

INSERT INTO AccountDuplicates
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork
  ON DUPLICATE KEY UPDATE
    EmployeeID = CONCAT(VALUES(EmployeeID), '*');

The INSERT statement gives me errors, and I do not see what I am doing wrong:

[42000][1064] You have an error in your SQL syntax; check the manual that correspondsto your MySQL server version for the right syntax to use near 'FROM EAD_UserAccountsWork
           ON DUPLICATE KEY UPDATE EmployeeID = CONCAT(VALUES(E' at line 36

In case it is relevant, I am running MySQL 5.7.12 on OS X 10.11.4, INNODB engine and mysql_mode = ''. My intent here is to identify the duplicate IDs so I can forward them to the appropriate DBAs for correction.

Update: I have set the database defaults as follows:

[client]
default-character-set = utf8mb4

[mysqld]
sql_mode=''
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[mysql]
default-character-set = utf8mb4
Michael Sheaver
  • 2,059
  • 5
  • 25
  • 38

2 Answers2

0

I think it's because you have not qualified correctly the "source" EmployeeID

INSERT INTO AccountDuplicates
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork t
  ON DUPLICATE KEY UPDATE
    EmployeeID = CONCAT(t.EmployeeID, '*');

the EmployeeID located left of the = refers to the AccountDuplicates table, the one on the right refers to the AccountsWork table

Thomas G
  • 9,886
  • 7
  • 28
  • 41
  • That helped, somewhat; now I am getting a slightly different error: [42000][1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM AccountsWork' at line 35 – Michael Sheaver Apr 24 '16 at 19:34
  • @MichaelSheaver retry with an alias for AccountsWork , I have adapted my query – Thomas G Apr 24 '16 at 19:37
  • Added the aliases and got the following: [42000][1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM EAD_UserAccountsWork t ON DUPLICATE KEY UPDATE EmployeeID = CONCAT(VALUES' at line 36 I wonder if it may be a MySQL 5.7 setting somewhere? I have had trouble deleting duplicates in this setup as well. I added an additional detail above. – Michael Sheaver Apr 24 '16 at 19:45
  • @MichaelSheaver even with `CONCAT(t.EmployeeID, '*')` instead of `CONCAT(VALUES...` ?? You dont need that `vALUES()` in your concat, I am not even sure it is proper MySQL syntax – Thomas G Apr 24 '16 at 19:54
  • Changed it to CONCAT(t.EmployeeID, '*') and still get: [42000][1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM EAD_UserAccountsWork ON DUPLICATE KEY UPDATE EmployeeID = CONCAT(t.Empl' at line 36 I even turned on 'Show invisibles" in y Aton editor to see if there are some hidden characters in there, and nothing there. – Michael Sheaver Apr 24 '16 at 20:06
0

your syntax looks ok, maybe there was a typing error? this almost works:

-- drop table AccountsWork ;
-- drop table  AccountDuplicates;
CREATE TABLE AccountsWork (
    EmployeeID varchar(16),
    FirstName INT,
    LastName INT
);
CREATE TABLE AccountDuplicates (
    EmployeeID  varchar(16),
    FirstName INT,
    LastName INT
);
    alter table AccountDuplicates add unique index(EmployeeID); 
insert into AccountsWork values('a',2,3);
insert into AccountsWork values(1,2,3);
insert into AccountsWork values('b',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
SELECT 
    *
FROM
    AccountsWork;
-- there is no syntax errors here (your original query):    
INSERT INTO AccountDuplicates
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork
  ON DUPLICATE KEY UPDATE
    EmployeeID =  CONCAT(VALUES(EmployeeID), '*');

SELECT 
    *
FROM
    AccountDuplicates;    

simple insert with duplicate keys works, but "select from table on duplicate key update" does not work.

I think you need to look in this question: INSERT INTO ... SELECT FROM ... ON DUPLICATE KEY UPDATE

It looks like mysql parser going crazy with this kind of queries

Community
  • 1
  • 1
strangeqargo
  • 1,276
  • 1
  • 15
  • 23