Late to the party, but I'm pondering something similar.
I created the following table to track active users on a license per day:
CREATE TABLE `license_active_users` (
`license_active_user_id` int(11) NOT NULL AUTO_INCREMENT,
`license_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`license_active_user_id`),
UNIQUE KEY `license_id` (`license_id`,`user_id`,`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
In other words, one primary key plus one unique index across the remaining three columns.
I then inserted 1 million unique rows into the table.
Attempting to re-insert a subset (10,000 rows) of the same data yielded the following results:
INSERT IGNORE: 38 seconds
INSERT ... ON DUPLICATE KEY UPDATE: 40 seconds
if (!rowExists("SELECT ...")) INSERT: <2 seconds
If those 10,000 rows aren't already present in the table:
INSERT IGNORE: 34 seconds
INSERT ... ON DUPLICATE KEY UPDATE: 41 seconds
if (!rowExists("SELECT ...")) INSERT: 21 seconds
So the conclusion must be that if (!rowExists("SELECT ...")) INSERT is fastest by far, at least for this particular table configuration.
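The three approaches can be sketched side by side. Here is a minimal, self-contained illustration using Python's sqlite3 module (SQLite spells the first two as INSERT OR IGNORE and ON CONFLICT ... DO UPDATE rather than MySQL's INSERT IGNORE and ON DUPLICATE KEY UPDATE, and the table below is a simplified stand-in for the schema above, not a benchmark):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE active_users (
        license_id INTEGER NOT NULL,
        user_id    INTEGER NOT NULL,
        date       TEXT    NOT NULL,
        UNIQUE (license_id, user_id, date)
    )
""")
row = (1, 42, "2024-01-01")

# Approach 1: INSERT IGNORE (SQLite spelling: INSERT OR IGNORE).
conn.execute("INSERT OR IGNORE INTO active_users VALUES (?, ?, ?)", row)
conn.execute("INSERT OR IGNORE INTO active_users VALUES (?, ?, ?)", row)  # duplicate silently skipped

# Approach 2: ON DUPLICATE KEY UPDATE (SQLite spelling: ON CONFLICT ... DO UPDATE).
conn.execute(
    "INSERT INTO active_users VALUES (?, ?, ?) "
    "ON CONFLICT(license_id, user_id, date) DO UPDATE SET date = excluded.date",
    row,
)

# Approach 3: check first, then insert.
def row_exists(conn, row):
    cur = conn.execute(
        "SELECT 1 FROM active_users WHERE license_id = ? AND user_id = ? AND date = ?",
        row,
    )
    return cur.fetchone() is not None

if not row_exists(conn, row):
    conn.execute("INSERT INTO active_users VALUES (?, ?, ?)", row)

count = conn.execute("SELECT COUNT(*) FROM active_users").fetchone()[0]
print(count)  # 1 -- all three approaches left exactly one copy of the row
```

All three end with a single copy of the row; they differ in how the duplicate is detected, which is where the timing differences above come from.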
The missing test is if (rowExists("SELECT ...")) { UPDATE } else { INSERT }, but I'll assume INSERT ... ON DUPLICATE KEY UPDATE is faster for this operation.
For your particular case, however, I would go with INSERT IGNORE, because (as far as I'm aware) it's an atomic operation, and that will save you a lot of trouble when working with threads.
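To make the race concrete: with check-then-insert, two workers can both observe "row absent" before either inserts, and the loser then hits a duplicate-key error. A deterministic simulation of that interleaving (Python/sqlite3 again; the two "workers" here are just sequential steps standing in for concurrent threads or processes):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE active_users ("
    " license_id INTEGER, user_id INTEGER, date TEXT,"
    " UNIQUE (license_id, user_id, date))"
)
row = (1, 42, "2024-01-01")

def row_exists(row):
    return conn.execute(
        "SELECT 1 FROM active_users WHERE license_id=? AND user_id=? AND date=?", row
    ).fetchone() is not None

# Unlucky interleaving: both workers run their existence check
# before either of them performs the insert.
worker_a_saw_absent = not row_exists(row)
worker_b_saw_absent = not row_exists(row)

if worker_a_saw_absent:
    conn.execute("INSERT INTO active_users VALUES (?, ?, ?)", row)

duplicate_error = False
if worker_b_saw_absent:
    try:
        conn.execute("INSERT INTO active_users VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        duplicate_error = True  # the check passed, yet the insert still failed

print(duplicate_error)  # True: check-then-insert is not atomic

# INSERT IGNORE (OR IGNORE in SQLite) folds the check and the insert
# into one atomic statement, so this interleaving cannot fail.
conn.execute("INSERT OR IGNORE INTO active_users VALUES (?, ?, ?)", row)
```

So check-then-insert either needs error handling (or a lock/transaction) around the insert, while INSERT IGNORE handles the collision inside a single statement.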