93

Normally I can insert a row into a MySQL table and get the last_insert_id back. Now, though, I want to bulk insert many rows into the table and get back an array of IDs. Does anyone know how I can do this?
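
For context, a minimal sketch of the single-row case described above (table and column names are illustrative):

INSERT INTO t (col) VALUES ('x');
SELECT LAST_INSERT_ID();  -- ID of the row just inserted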

There are some similar questions, but they are not exactly the same. I don't want to insert the new IDs into any temporary table; I just want to get back the array of IDs.

Can I retrieve the lastInsertId from a bulk insert?

Mysql mulitple row insert-select statement with last_insert_id()

Peacemoon
  • Why can't you insert them one by one? – sanmai Sep 07 '11 at 12:00
  • You need to simulate the `OUTPUT` clause. I believe you can do this with a trigger in MySQL – Martin Smith Sep 07 '11 at 12:01
  • You can't use bulk insert and then a function like last_insert_id(), it doesn't work. – N.B. Sep 07 '11 at 12:01
  • possible duplicate of [OUTPUT clause in mysql](http://stackoverflow.com/questions/5817414/output-clause-in-mysql) – Martin Smith Sep 07 '11 at 12:02
  • @Martin: That's not really a duplicate. This here is about fetching many ID's (`last_insert_id()`), not just one – Lukas Eder Sep 07 '11 at 12:10
  • @Peacemoon: Where do you need these ID's? In JDBC? – Lukas Eder Sep 07 '11 at 12:10
  • @Lukas - Ah OK. In SQL Server the trigger could do `SELECT * FROM INSERTED` to return the new ids to the client. Is this not possible in MySQL then? – Martin Smith Sep 07 '11 at 12:13
  • @Martin: I happen to be very curious about an answer to this one, actually. In Postgres you can write `INSERT .. RETURNING *` to return all rows you just inserted. MySQL doesn't have such a clause. With JDBC, you can almost always fetch all auto-generated ID's, but this here is not necessarily about JDBC. Let's wait and see – Lukas Eder Sep 07 '11 at 12:14
  • @Eder: I want to use it in PHP. Right now I'm using a hack: I call last_insert_id() and row_count(), then create an array with elements ranging from last_insert_id() to last_insert_id() + row_count(). I don't know whether that is an elegant solution. – Peacemoon Sep 08 '11 at 12:57
  • 4
    No - that method is flawed too - unless you wrap the insert in a LOCK TABLES ... WRITE, and you should also allow for auto_increment_increment – symcbean Sep 09 '11 at 11:23

9 Answers

90

Old thread, but I just looked into this, so here goes: if you are using InnoDB on a recent version of MySQL, you can get the list of IDs using LAST_INSERT_ID() and ROW_COUNT().

InnoDB guarantees sequential numbers for AUTO_INCREMENT when doing bulk inserts, provided innodb_autoinc_lock_mode is set to 0 (traditional) or 1 (consecutive). Consequently, you can get the first ID from LAST_INSERT_ID() and the last by adding ROW_COUNT() - 1.
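
A minimal sketch of that recipe, assuming a table t with an AUTO_INCREMENT id column (names are illustrative):

INSERT INTO t (col) VALUES ('a'), ('b'), ('c');

-- read both values immediately; ROW_COUNT() reflects only the last statement
SELECT LAST_INSERT_ID(), ROW_COUNT() INTO @first_id, @row_count;

-- the inserted IDs are @first_id .. @first_id + @row_count - 1
SELECT id FROM t WHERE id BETWEEN @first_id AND @first_id + @row_count - 1;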

Dag Sondre Hansen
  • 12
    Do you need a transaction? What if other inserts came in the middle? –  Jul 16 '13 at 21:46
  • 4
    As long as you use the InnoDB engine and ensure that auto-increment locking is enabled (which may result in slower performance), no explicit transaction definition is needed. Since InnoDB performs a lock during the insert, other inserts must wait for it to finish. Have a look at http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html – Dag Sondre Hansen Jul 29 '13 at 10:14
  • As far as I know this cannot be done for MyISAM tables. – Dag Sondre Hansen Feb 07 '15 at 11:17
  • Why not? If you have auto increment for MyISAM tables, what's stopping you from doing the same thing? Auto increment for MyISAM is sequential as well, isn't it? Are you referring to the fact that inserts might happen between the bulk insert and running last_insert_id() and row_count()? – CMCDragonkai Feb 09 '15 at 04:18
  • 1
    MyISAM is not a transactional db-engine and thus other inserts may happen while the bulk insert is in progress, messing up the sequence. Locking may prevent this (not too sure though) but the one solution I have found, which is documented by MySQL, is the one described above. – Dag Sondre Hansen Feb 10 '15 at 17:33
  • What if innodb_autoinc_lock_mode is set to 2 (“interleaved” lock mode) and you do a bulk insert inside a transaction? Could it be that when multiple servers do that at the same time, the insert IDs from one bulk insert are not sequential? – Wulf Jun 30 '16 at 07:27
  • @Wulf From the MySQL docs, about "interleaved" mode: "In this lock mode, no “INSERT-like” statements use the table-level AUTO-INC lock, and multiple statements can execute at the same time. This is the fastest and most scalable lock mode, but it is not safe when using statement-based replication or recovery scenarios when SQL statements are replayed from the binary log." – Dag Sondre Hansen Jun 30 '16 at 07:52
  • @Dag Sondre Hansen: But what does "no INSERT-like statements" mean? If only INSERT statements do create autoincrement values, is it also safe to use? ty! – Wulf Jun 30 '16 at 08:13
  • 1
    @Wulf I assume (assume underlined) it refers to INSERT, LOAD DATA, REPLACE INTO etc. – Dag Sondre Hansen Jun 30 '16 at 08:26
  • What if ON DUPLICATE KEY UPDATE is used? Is there a way to retrieve the inserted/updated IDs, which are auto-incremented in case of an insert but not in case of an update? – StackExploded Jul 17 '16 at 22:55
  • I would be afraid of using this if one of the rows failed; the inserted IDs you process back in your code would then not match your initial array. – NaturalBornCamper Oct 11 '16 at 08:02
  • why `-1` on the row_count()? – Adders Apr 21 '17 at 23:32
  • 1
    @Adders It's simple math: LAST_INSERT_ID() holds the first ID. If, for example, only one row is inserted, the first value is LAST_INSERT_ID() and the last is LAST_INSERT_ID() + 1 - 1 (the same ID). – Dag Sondre Hansen Apr 25 '17 at 10:23
  • 1
    I think it's also important to note that if you rely on a method like this to get multiple insert IDs, you won't be able to use multi-master clusters, like Galera Cluster, should you ever need such a thing. – JohnK Apr 03 '20 at 20:29
23

The only way I can think of to do it is to store a unique identifier (a GUID) for each set of rows inserted, then select the row IDs. e.g.:

-- tag every row in the batch with the same GUID, then select the IDs back
INSERT INTO t1
  (SELECT col1, col2, col3, '3aee88e2-a981-1027-a396-84f02afe7c70' FROM a_very_large_table);
COMMIT;

SELECT id FROM t1
WHERE guid = '3aee88e2-a981-1027-a396-84f02afe7c70';

You could also generate the GUID in the database by using UUID().
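
Note that UUID() in the SELECT list is evaluated once per row, so each row would get a different value; capture it into a variable first. A sketch, reusing the hypothetical t1/guid schema above:

SET @batch_guid = UUID();

INSERT INTO t1
  SELECT col1, col2, col3, @batch_guid FROM a_very_large_table;

SELECT id FROM t1 WHERE guid = @batch_guid;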

Kevin Burton
5

Let's assume we have a table called temptable with two columns, uid and col1, where uid is an auto-increment field. Doing something like the following will return all the inserted IDs in the result set, and you can loop through the result set to get them. I realize this is an old post and this solution might not work for every case, but for others it might, and that's why I'm replying to it.

# lock the table
lock tables temptable write;

# bulk insert the rows
insert into temptable (col1) values (1), (2), (3), (4);

# get the value of the first inserted row: when bulk inserting,
# last_insert_id() gives the value of the first inserted row of the bulk op
set @first_id = last_insert_id();

# now select the auto increment field whose value is greater than or equal to
# the first row's. Since you hold a write lock on the table, other sessions
# can't write to it, so this result set has all the inserted IDs.
select uid from temptable where uid >= @first_id;

# now that you are done, don't forget to unlock the table
unlock tables;
sundeep
2

It's worth noting that @Dag Sondre Hansen's answer can also be used when innodb_autoinc_lock_mode is set to 2, by simply locking the table before the insert.

LOCK TABLE my_table WRITE;
INSERT INTO my_table (col_a, col_b, col_c) VALUES (1,2,3), (4,5,6), (7,8,9);
SET @row_count = ROW_COUNT();
SET @last_insert_id = LAST_INSERT_ID();
UNLOCK TABLES;
SELECT id FROM my_table WHERE id >= @last_insert_id AND id <= @last_insert_id + (@row_count - 1);

Here's a fiddle demonstrating: https://www.db-fiddle.com/f/ahXAhosYkkRmwqR9Y4mAsr/0

1

I wouldn't be sure that the auto-increment value always increases by 1. If your DB uses master/master replication, auto_increment_increment is set to 2 to avoid duplicate keys (and to 3 if there is one more master), so each new ID is +2 or +3 instead of +1. Relying on AUTO_INCREMENT going up by exactly 1 can kill your project.

I see only a couple of good options for doing this.

This SQL snippet has no problems with multiple masters and gives good results as long as you need only the inserted records. Across multiple requests without transactions, it can catch other sessions' inserted records.

START TRANSACTION;
SELECT max(id) into @maxLastId FROM `main_table`;
INSERT INTO `main_table` (`value`) VALUES ('first'), ('second') ON DUPLICATE KEY UPDATE `value` = VALUES(`value`);
SELECT `id` FROM `main_table` WHERE id > @maxLastId OR @maxLastId IS NULL;
COMMIT;

If you also need the records updated by ON DUPLICATE KEY UPDATE, you will need to refactor the database a bit, and the SQL will look like the following (safe both with and without transactions inside one connection):

#START TRANSACTION
# bulk_inserts is a helper table with a single auto-increment id column
INSERT INTO bulk_inserts VALUES (null);
SET @bulkTransactionId = LAST_INSERT_ID();
SELECT @bulkTransactionId, LAST_INSERT_ID();
INSERT INTO `main_table` (`value`, `transaction_id`)
VALUES ('first', @bulkTransactionId), ('second', @bulkTransactionId)
ON DUPLICATE KEY UPDATE `value` = VALUES(`value`), `transaction_id` = VALUES(`transaction_id`);
SELECT @bulkTransactionId, LAST_INSERT_ID();
SELECT id FROM `main_table` WHERE `transaction_id` = @bulkTransactionId;
#COMMIT
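
For completeness, a minimal sketch of the schema this snippet assumes; the exact definitions are my assumption, not spelled out in the answer:

-- helper table: one auto-increment row per bulk operation
CREATE TABLE bulk_inserts (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
);

-- main_table needs a column to tag rows with the bulk operation's ID
ALTER TABLE main_table ADD COLUMN transaction_id INT UNSIGNED NULL;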

Both approaches are transaction-safe: the first shows you only inserted records, while the second gives you all affected records, even updated ones.

These options will also work with INSERT IGNORE ...

Neznajka
1

This thread is old, but none of these solutions helped me, so I came up with my own.

First, count how many rows you want to insert; let's say we need to add 5 rows:

LOCK TABLE tbl WRITE;

SELECT `AUTO_INCREMENT` FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'my_db' AND TABLE_NAME = 'tbl';

Then use the AUTO_INCREMENT value just selected to reserve the range with the next query:

ALTER TABLE tbl AUTO_INCREMENT = {AUTO_INCREMENT}+5;
UNLOCK TABLES;

Finally, do your inserts, using the reserved auto-increment range to supply the IDs explicitly.
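
A sketch of that final step, assuming the SELECT above returned an AUTO_INCREMENT value of 100, so IDs 100 through 104 are reserved (values are illustrative):

INSERT INTO tbl (id, col1) VALUES
    (100, 'a'), (101, 'b'), (102, 'c'), (103, 'd'), (104, 'e');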

Warning: this solution requires an elevated access level to the tables. But bulk inserts are usually run by crons and importer scripts that may use special access anyway. You would not use this for just a few inserts.

This may leave unused IDs if you use ON DUPLICATE KEY UPDATE.

Paul G Mihai
0

For anyone using Java with JDBC, it is possible. I am getting IDs back from a batch insert by doing it like this:

PreparedStatement insertBatch = null;
Connection connection = ....;

for (Event event : events) {

    if (insertBatch == null) {
        insertBatch = connection.prepareStatement(
            "insert into `event` (game, `type`, actor, target, arg1, arg2, arg3, created) " +
            "values (?, ?, ?, ?, ?, ?, ?, ?)", Statement.RETURN_GENERATED_KEYS);
    }

    insertBatch.setObject(1, event.game);
    insertBatch.setString(2, event.type);
    insertBatch.setObject(3, event.actor);
    insertBatch.setObject(4, event.target);
    insertBatch.setString(5, event.arg1);
    insertBatch.setObject(6, event.arg2);
    insertBatch.setObject(7, event.arg3);
    insertBatch.setTimestamp(8, new Timestamp(event.created.getTime()));
    insertBatch.addBatch();
}

if (insertBatch != null) {
    insertBatch.executeBatch();
    ResultSet generatedKeys = insertBatch.getGeneratedKeys();

    for (Event event : events) {

        if (generatedKeys == null || !generatedKeys.next()) {
            logger.warn("Unable to retrieve all generated keys");
            break;  // stop rather than reading past the end of the key result set
        }
        event.id = generatedKeys.getLong(1);
    }

    logger.debug("events inserted");
}

Source: "Using MySQL I can do it with JDBC this way:" - Plap - https://groups.google.com/g/jdbi/c/ZDqnfhK758g?pli=1

I actually have to add rewriteBatchedStatements=true to my JDBC URL (e.g. jdbc:mysql://host/db?rewriteBatchedStatements=true), or else the actual inserts show up in the MySQL general query log as separate rows. With 7000 rows inserted, I got 2m11s for regular inserts, 46s for batched inserts without the rewrite option, and 1.1s with it on. Also, it does not make other people's inserts block (I tested that). When I inserted 200k rows, it grouped them into about 36k per statement, i.e. insert into abc(..) values(..),(..),(..)....

I am actually using Spring's JdbcTemplate, so the way to access the PreparedStatement is:

ArrayList<Long> generatedIds = (ArrayList<Long>) jdbcTemplate.execute(
    new PreparedStatementCreator() {
        @Override
        public PreparedStatement createPreparedStatement(Connection connection) throws SQLException {
            return connection.prepareStatement(insertSql, Statement.RETURN_GENERATED_KEYS);
        }
    },
    new PreparedStatementCallback<Object>() {
        @Override
        public Object doInPreparedStatement(PreparedStatement ps) throws SQLException, DataAccessException {
            // see above answer for setting the row data
            ...
            ps.executeBatch();

            ResultSet resultSet = ps.getGeneratedKeys();
            ArrayList<Long> ids = new ArrayList<>();
            while (resultSet.next()) {
                ids.add(resultSet.getLong(1));
            }
            return ids;
        }
    }
);
Curtis Yallop
0

I think you will have to handle either the transaction ID or the item ID in your application in order to do this flawlessly.

One way to do this which could work, assuming that all your inserts succeed (!), is the following:

You can then get the inserted IDs with a loop over the number of affected rows, starting with lastid (which is the first inserted ID of the bulk insert). I checked that it works perfectly. Just be careful: HeidiSQL, for example, will not return the correct value for ROW_COUNT(), probably because the GUI runs queries of its own behind the scenes; it is perfectly correct from either the command line or PHP mysqli:

START TRANSACTION;
INSERT INTO test (b) VALUES ('1'),('2'),('3');
SELECT LAST_INSERT_ID() AS lastid, ROW_COUNT() AS rowcount;
COMMIT;

In PHP it looks like this (local_sqle is a straight call to mysqli_query; local_sqlec is a call to mysqli_query plus converting the result set to a PHP array):

local_sqle("START TRANSACTION;
BEGIN;
INSERT into test (b) VALUES ('1'),('2'),('3');");
$r=local_sqlec("SELECT LAST_INSERT_ID() AS lastid,ROW_COUNT() AS rowcount;");
local_sqle("
COMMIT;");
$i=0;
echo "last id =".($r[0]['lastid'])."<br>";
echo "Row count =".($r[0]['rowcount'])."<br>";

while($i<$r[0]['rowcount']){
    echo "inserted id =".($r[0]['lastid']+$i)."<br>";
    $i++;
}

The reason the queries are separated is that I wouldn't otherwise get my result back using my own functions. If you do this with standard functions, you can put it back into one statement and then retrieve the result you need (it should be result set number 2, assuming you use an extension that handles more than one result set per query).

Morg.
-13
$query = "INSERT INTO TABLE (ID,NAME,EMAIL) VALUES (NULL,VALUE1, VALUE2)";
$idArray = array();
foreach($array as $key) {
 mysql_query($query);
 array_push($idArray, mysql_insert_id());
}
print_r($idArray);
Ajayendra