Problem
There may be some confusion for mysqlclient-python/pymysql users who expect executemany of sqlite3/apsw to rewrite their INSERT INTO table VALUES(?, ?) into a multi-row INSERT statement.
For instance, executemany of mysqlclient-python has this in its docstring:

This method improves performance on multiple-row INSERT and REPLACE. Otherwise it is equivalent to looping over args with execute().
Python stdlib's sqlite3.Cursor.executemany doesn't have this optimisation. It's always loop-equivalent. Here's how to demonstrate it (unless you want to read some C, _pysqlite_query_execute):
import sqlite3
conn = sqlite3.connect(':memory:', isolation_level=None)
conn.set_trace_callback(print)
conn.execute('CREATE TABLE tbl (x INTEGER, y INTEGER)')
conn.executemany('INSERT INTO tbl VALUES(?, ?)', [(i, i ** 2) for i in range(5)])
It prints:
CREATE TABLE tbl (x INTEGER, y INTEGER)
INSERT INTO tbl VALUES(0, 0)
INSERT INTO tbl VALUES(1, 1)
INSERT INTO tbl VALUES(2, 4)
INSERT INTO tbl VALUES(3, 9)
INSERT INTO tbl VALUES(4, 16)
Solution
Thus, you either need to rewrite these INSERTs into a multi-row one (manually or, for instance, with python-sql) to stay in auto-commit mode (isolation_level=None), or wrap your INSERTs in a transaction (with a sensible number of INSERTs per transaction) in the default implicit-commit mode. The latter means the following for the above snippet:
import sqlite3
conn = sqlite3.connect(':memory:')
conn.set_trace_callback(print)
conn.execute('CREATE TABLE tbl (x INTEGER, y INTEGER)')
with conn:
conn.executemany('INSERT INTO tbl VALUES(?, ?)', [(i, i ** 2) for i in range(5)])
Now it prints:
CREATE TABLE tbl (x INTEGER, y INTEGER)
BEGIN
INSERT INTO tbl VALUES(0, 0)
INSERT INTO tbl VALUES(1, 1)
INSERT INTO tbl VALUES(2, 4)
INSERT INTO tbl VALUES(3, 9)
INSERT INTO tbl VALUES(4, 16)
COMMIT
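The former option, rewriting the per-row INSERTs into a single multi-row statement to stay in auto-commit mode, can be sketched as follows. The placeholder-building code here is hand-rolled for illustration, not taken from any library:

```python
import sqlite3

conn = sqlite3.connect(':memory:', isolation_level=None)
conn.set_trace_callback(print)
conn.execute('CREATE TABLE tbl (x INTEGER, y INTEGER)')

rows = [(i, i ** 2) for i in range(5)]
# Build one statement: INSERT INTO tbl VALUES (?, ?), (?, ?), ...
placeholders = ', '.join(['(?, ?)'] * len(rows))
# Flatten the row tuples into a single parameter sequence
params = [value for row in rows for value in row]
conn.execute(f'INSERT INTO tbl VALUES {placeholders}', params)
```

The trace now shows a single INSERT carrying all five rows. Note that the number of bound parameters per statement is capped by SQLite's SQLITE_MAX_VARIABLE_NUMBER compile-time limit, so a large dataset still has to be chunked into batches.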
For further bulk-insert performance improvements in SQLite, I'd suggest starting with this overview question.