
Here's my database schema:

+------------------+------------------+------+-----+---------------------+----------------+
| Field            | Type             | Null | Key | Default             | Extra          |
+------------------+------------------+------+-----+---------------------+----------------+
| phone_number     | varchar(64)      | NO   | UNI | NULL                |                |
| id               | int(10) unsigned | NO   | PRI | NULL                | auto_increment |
+------------------+------------------+------+-----+---------------------+----------------+

I'd like to be able to insert several phone numbers (phone_number is a unique key) at once, but I don't want to increment the auto_increment field if I have duplicates.

If I do

INSERT INTO phone_numbers (phone_number)
VALUES (%values%)
ON DUPLICATE KEY UPDATE id = id;

the auto_increment will increase even for duplicates.

This question: Prevent auto increment on MySQL duplicate insert doesn't handle bulk inserts. I'd like to do something like this:

INSERT INTO phone_numbers (phone_number)
SELECT '12345', '123456'
FROM DUAL
WHERE NOT EXISTS (
    SELECT phone_number
    FROM phone_numbers
    WHERE phone_number IN ('12345', '123456')
);

but the DUAL table doesn't really handle multiple values well.
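A sketch of the same NOT EXISTS idea that fakes a multi-row VALUES list with a UNION ALL derived table instead of DUAL (the alias `candidates` is arbitrary, and this is only a sketch of the shape):

```sql
-- Build the candidate rows with UNION ALL, then keep only the
-- numbers not already present in phone_numbers.
INSERT INTO phone_numbers (phone_number)
SELECT candidates.phone_number
FROM (
    SELECT '12345' AS phone_number
    UNION ALL
    SELECT '123456'
) AS candidates
WHERE NOT EXISTS (
    SELECT 1
    FROM phone_numbers
    WHERE phone_numbers.phone_number = candidates.phone_number
);
```

but that gets unwieldy to generate for large batches, and it can still race with concurrent inserts.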

Any ideas? MySQL 5.5.

Newtang
  • Quite seriously, why does it matter? – eggyal Nov 10 '12 at 01:39
  • I am curious why you care about the auto_increment field increasing. – Andrew T Finnell Nov 10 '12 at 01:41
  • Because I could hit the ceiling for int unsigned. Yes, I know I could bump it up to `mediumint` or `bigint`, but I'd prefer not to if I didn't have to. I'm going to be dealing with far more duplicates than actual values. – Newtang Nov 10 '12 at 01:44

3 Answers


After reading the other article, one approach is to bulk insert into a temporary table, then select the rows from the temporary table into the actual table. The duplicate rows are removed at that point, and the auto_increment field in the actual table stays correct.

CREATE TABLE PHONE_NUMBERS (id int(10) NOT NULL AUTO_INCREMENT, phone_number varchar(64), primary key (id), unique(phone_number) );

CREATE TEMPORARY TABLE TMP_PHONE_NUMBERS ( phone_number varchar(64), unique(phone_number) );

-- bulk insert
INSERT INTO tmp_phone_numbers (phone_number) VALUES (%values%);

-- remove phone numbers that already exist; this leaves a unique
-- set of phone numbers that do not exist in the real table
DELETE FROM tmp_phone_numbers WHERE phone_number IN (SELECT phone_number FROM phone_numbers);

-- copy into the real table
INSERT INTO phone_numbers (phone_number) SELECT phone_number FROM tmp_phone_numbers;

-- the temporary table is dropped when your connection is closed

Here is another option:

If you know you won't hit the int(10) ceiling in the first bulk load or any subsequent ones, you can use `INSERT INTO phone_numbers (phone_number) VALUES (%values%) ON DUPLICATE KEY UPDATE id=id;`, which will leave gaps in the id field. But after you are done bulk loading, you can drop the id column and then re-add it, which recreates all of your ids with no gaps.
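A sketch of that drop/re-add step (this rebuilds the table, and assumes no foreign keys or application code depend on the old id values):

```sql
-- Renumber id with no gaps once bulk loading is finished.
ALTER TABLE phone_numbers DROP COLUMN id;
ALTER TABLE phone_numbers
  ADD COLUMN id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
```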

Andrew T Finnell
  • I think I see where you're headed. Could I trouble you for an example query? – Newtang Nov 10 '12 at 01:47
  • Suggest you use `TEMPORARY TABLE` for this purpose. You might also want to perform a lookup into `phone_numbers` prior to insertion (I guess that was kind of the point?). – eggyal Nov 10 '12 at 02:06
  • @eggyal If he can ensure that all this is done in a single connection I agree. – Andrew T Finnell Nov 10 '12 at 02:08
  • I might be misunderstanding, but I think you might be missing something. Inserting the values from tmp_phone_numbers into phone_numbers doesn't prevent auto_increment to increase for duplicates. Something needs to filter out the phone numbers that are already in phone_numbers. – Newtang Nov 10 '12 at 03:02
  • @Newtang The Unique attribute on the phone_number column in tmp_phone_numbers will filter out the duplicate phone numbers. Ah I see now. Updating answer. – Andrew T Finnell Nov 10 '12 at 03:31

Since INSERT IGNORE increases the table's auto-increment counter even when it makes no changes to the data, a simple workaround is to restore the auto-increment value at the end of the transaction. This leverages the engine logic that, when asked to force a value that is too low, always chooses an integer greater than the current highest auto-increment value:

ALTER TABLE phone_numbers AUTO_INCREMENT = 0;

Doing this does not require a table rebuild, and can be tested as follows:

SELECT MAX(id) FROM phone_numbers;
INSERT IGNORE INTO phone_numbers (phone_number) VALUES ("0123456789");
SELECT MAX(id) FROM phone_numbers;
INSERT IGNORE INTO phone_numbers (phone_number) VALUES ("0123456789");
SELECT MAX(id) FROM phone_numbers;
ALTER TABLE phone_numbers AUTO_INCREMENT = 1;
INSERT IGNORE INTO phone_numbers (phone_number) VALUES ("1000000000");
SELECT MAX(id) FROM phone_numbers;

+---------+
| MAX(id) |
+---------+
|    NULL |
+---------+
1 row in set (0.00 sec)

Query OK, 1 row affected (0.00 sec)

+---------+
| MAX(id) |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

+---------+
| MAX(id) |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

Query OK, 1 row affected (0.33 sec)
Records: 1  Duplicates: 0  Warnings: 0

Query OK, 0 rows affected (0.00 sec)

+---------+
| MAX(id) |
+---------+
|       2 |
+---------+
1 row in set (0.00 sec)
Christopher McGowan
  • You might be thinking of composite key incrementing. InnoDB just uses the index required for the auto-increment attribute (usually just the clustered index when you set primary key), so when you perform that alter, and the value specified is less than the right-most value in the tree, it will silently substitute that right-most value. – Christopher McGowan Dec 20 '19 at 15:33

Just use INSERT IGNORE. That will ignore any inserts that fail because of duplicate keys. The rest of the inserts will go through fine.
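In bulk form that looks like this (note that, per the question and the other answers, InnoDB may still consume auto-increment values for the skipped duplicates):

```sql
-- The duplicate '12345' is silently skipped; the other rows insert normally.
INSERT IGNORE INTO phone_numbers (phone_number)
VALUES ('12345'), ('123456'), ('12345');
```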

Brent Baisley
  • This isn't about inserts that fail because of duplicate keys. – Newtang Nov 10 '12 at 06:58
  • I know. But INSERT IGNORE will only insert new records and won't auto increment any existing records. It won't even touch existing records, thus solving the stated problem. – Brent Baisley Nov 10 '12 at 21:02