15

I keep track of all the http_user_agents that visit me, with a simple hit counter. The below insert the http_user_agent in the DB, this field is Case Insensitive and is Unique. So when we try to insert it and it finds a DUPLICATE KEY, it adds 1 to the hits field.

The problem is my Auto Increment field still increases even though we did not insert a field. how can i prevent this?

$sql = "INSERT INTO `db_agency_cloud`.`tblRefHttpUsersAgent` SET `http_users_agent` = :UsersAgent, `created_ts` = NOW() ON DUPLICATE KEY UPDATE `hits` = `hits` + 1";

Here is the Table stucture:

CREATE TABLE `tblRefHttpUsersAgent`
(
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`http_users_agent` varchar(255) NOT NULL,
`hits` int(20) unsigned NOT NULL DEFAULT '1',
`created_ts` datetime NOT NULL,
`activity_ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `http_users_agent` (`http_users_agent`)
)
ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
nawfal
  • 70,104
  • 56
  • 326
  • 368
M. of CA
  • 1,496
  • 2
  • 22
  • 32

4 Answers4

21

INSERT ... ON DUPLICATE KEY UPDATE is described as a "mixed-mode insert" for the purposes of InnoDB's AUTO_INCREMENT handling. Mixed-mode inserts are basically ones where the maximum number of required AUTO_INCREMENT values is known, but the amount that will actually be needed is not.

Mixed-mode inserts get handled specially by default, as described in the MySQL docs:

...for “mixed-mode inserts”... InnoDB will allocate more auto-increment values than the number of rows to be inserted. However, all values automatically assigned are consecutively generated (and thus higher than) the auto-increment value generated by the most recently executed previous statement. “Excess” numbers are lost.

If you're using InnoDB, your alternatives are:

  1. Avoid INSERT ... ON DUPLICATE KEY UPDATE.
  2. Set the innodb_autoinc_lock_mode parameter to 0, for "traditional" autoincrement lock mode, which guarantees that all INSERT statements will assign consecutive values for AUTO_INCREMENT columns. However, this is accomplished by locking during the statement, so there's a performance loss associated with this setting.
  3. (Recommended) Ignore the gaps in the AUTO_INCREMENT column.

Note: AUTO_INCREMENT handling is totally different under MyISAM, which does not exhibit this behavior.

John Flatness
  • 32,469
  • 5
  • 79
  • 81
  • Trying to set the `innodb_autoinc_lock_mode` during runtime gave me the error: `SQLSTATE[HY000]: General error: 1238 Variable 'innodb_autoinc_lock_mode' is a read only variable'`, considering that I need this for a bulk UPSERT, because individual inserts are insanely slow. I can't use option 1, 2 or 3, because the gaps burn the autoinc ids to insane numbers. Can you suggest another solution? – Timo Huovinen Oct 27 '13 at 13:09
  • Would similar gaps in auto-increment field be observed if I use PostgreSQL ? – W.M. Aug 07 '16 at 09:36
  • Thank you for the great explanation. Although recommended by you, the 3rd option was not applicable to my case where we try to do multiple `INSERT ... ON DUPLICATE KEY UPDATE` in different transactions at the same time which caused constant deadlocks. Just my two cents, to keep in mind. – recepinanc Aug 13 '21 at 17:36
  • Ignoring the gaps is one thing, but having to change my id columns to BIGINTs just so that thousands of `INSERT ... ON DUPLICATE KEY UPDATE` queries can run daily is annoying at best. – twhitney Mar 22 '22 at 18:01
7

The storage engine must increment the AUTO_INCREMENT value before inserting a row. It does not know if the insert will fail yet at that point. It cannot simply roll back the increment because there may be other inserts happening on other connections concurrently. This is normal behavior and not something you should (or can) change. The purpose of AUTO_INCREMENT is to provide unique identifiers, not an unbroken sequence of numbers.

Dan Grossman
  • 51,866
  • 10
  • 112
  • 101
  • yes but i didnt want to make two calls to the database. I was hoping for a solutions that skiped this. Iam thinking maybe of a if statement. then checking at the database level, to see if it should insert or update a record. – M. of CA Aug 17 '11 at 06:46
  • 2
    +1: And interesting! OK, I grok _why_ this occurs (thanks to your answer); it certainly seems unfortunate, though. It's not intuitive at all and I'd even consider it a **bug** (either in the SQL standard, or in MySQL)... even when it's one that would always be resolved as `WONTFIX`. – Lightness Races in Orbit Aug 17 '11 at 23:25
  • Hmm. ["If you use INSERT IGNORE and the row is ignored, the AUTO_INCREMENT counter is not incremented and LAST_INSERT_ID() returns 0, which reflects that no row was inserted."](http://dev.mysql.com/doc/refman/5.5/en/information-functions.html#function_last-insert-id) It manages it there. Discrepancy? – Lightness Races in Orbit Aug 17 '11 at 23:38
  • http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html confuses me a bit. It doesn't appear to say anything pertinent to updating the `AUTO_INCREMENT` counter or otherwise but, as a curiosity, the last bit about making `LAST_INSERT_ID()` useful has me completely zoned out. I have no idea what it's talking about. – Lightness Races in Orbit Aug 17 '11 at 23:41
  • @Tomalak: That stuff is talking about what `LAST_INSERT_ID()` does when you pass it an argument: it returns that value, plus it sets that value as what the *next* call to `LAST_INSERT_ID()` should return. So, including the `LAST_INSERT_ID()` call in that query forces a subsequent call to `LAST_INSERT_ID()` to return the ID of the row affected by the statement, regardless of whether a new row was inserted (which works already without that "hack) or an old row was updated. – John Flatness Aug 18 '11 at 00:14
  • @DanGrossman: Still, it's annoying and _feels_ a bit messy. The intuition is "ugh what a waste of all those poor IDs I'm not using" – Lightness Races in Orbit Aug 18 '11 at 08:43
1

Unfortunately the solution is at the application level if you don't want to affect auto-increment id's. Do a SELECT first and count the result rows. If 0 results, INSERT the data. If more than 0, UPDATE that row.

Eliot
  • 5,450
  • 3
  • 32
  • 30
  • yes but i didnt want to make two calls to the database. I was hoping for a solutions that skiped this. Iam thinking maybe of a if statement. then checking at the database level, to see if it should insert or update a record. – M. of CA Aug 17 '11 at 06:46
0

You can first calculate the maximum number of inserted rows and add 1 to it,

(SELECT MAX(`id`)+1 FROM `tblRefHttpUsersAgent`)

then alter the table AUTO_INCREMENT using some variables SET @NEW_ID and PREPARE / EXECUTE statements.

here is a simple solution for the same problem and below is the finished version if you like the solution of your own specific problem:

    $sql = 'SET @NEW_AI = (SELECT MAX(`id`)+1 FROM `tblRefHttpUsersAgent`);
                    SET @ALTER_SQL = CONCAT("ALTER TABLE `tblRefHttpUsersAgent` AUTO_INCREMENT =", @NEW_AI);
                    PREPARE NEWSQL FROM @ALTER_SQL;
                    EXECUTE NEWSQL;';

    $sql .= 'INSERT INTO `db_agency_cloud`.`tblRefHttpUsersAgent` SET `http_users_agent` = :UsersAgent, `created_ts` = NOW() ON DUPLICATE KEY UPDATE `hits` = `hits` + 1';