
How can I use executemany here to speed up the process?

with dest_conn.cursor() as dcur:
    while True:
        rows = scur.fetchmany(size=25)
        if rows:
            place_holders = "(%s)" % ','.join("?"*len(rows[0]))
            place_holders_list = ', '.join([place_holders] * len(rows))
            insert_query = "INSERT IGNORE INTO `%s` VALUES %s" % (tname, place_holders_list)
            dcur.execute(insert_query, (val for row in rows for val in row))

        else:
            log("No more rows found to insert")
            break

Here dcur is the destination cursor (where I copy the data to) and scur is the source cursor (where I fetch the data from).

Even though I am inserting 25 rows at once (I found this number optimal for my database), I am creating a prepared statement and executing it for every batch. The oursql manual says executemany is faster because it can send all the values in one batch. How can I use it here instead of execute?

Charles
Shiplu Mokaddim

1 Answer


There are a few things you can change in your code. First, you really should create the insert_query string only once; it will never change in the loop. Also, it seems like you have some errors, like '?'*nr not returning a sequence of placeholders, so I correct these as well.

Using oursql

import oursql

# ...

place_holders = '(' + ','.join(['?'] * len(scur.description)) + ')'
insert_query = "INSERT IGNORE INTO `%s` VALUES %s" % (tname, place_holders)

with dest_conn.cursor() as dcur:
    while True:
        rows = scur.fetchmany(size=25)
        if not rows:
            log("No more rows found to insert")
            break

        dcur.executemany(insert_query, rows)

However, I do not see much optimisation done with the executemany() method. It will always use MySQL Prepared Statements and execute each insert one by one.

MySQL General log entries executing using oursql:

..
14 Prepare  SELECT * FROM t1
14 Execute  SELECT * FROM t1
15 Prepare  INSERT INTO `t1copy` VALUES (?)
15 Execute  INSERT INTO `t1copy` VALUES (1)
15 Execute  INSERT INTO `t1copy` VALUES (2)
15 Execute  INSERT INTO `t1copy` VALUES (3)
..

Using MySQL Connector/Python

If you use MySQL Connector/Python (note: I'm the maintainer), you'll see different queries going to the MySQL server. Here's similar code, reworked so it runs with mysql.connector:

import mysql.connector

# ...

place_holders = ','.join(['%s'] * len(scur.description))
insert_query = "INSERT INTO `{0}` VALUES ({1})".format(tname, place_holders)

dcur = dest_conn.cursor()
while True:
    rows = scur.fetchmany(size=25)
    if not rows:
        log("No more rows found to insert")
        break

    dcur.executemany(insert_query, rows)
    dest_conn.commit()

MySQL General log entries executing using mysql.connector:

..
18 Query    SELECT * FROM t1
19 Query    INSERT INTO `t1copy` VALUES (1),(2),(3),(4),(5),(6),(1),(2),(3),(4),(5),(6)
19 Query    COMMIT

Which is faster will have to be benchmarked. oursql uses the MySQL C library; MySQL Connector/Python is pure Python. The magic that produces the optimised multi-row insert is thus also pure Python string parsing, so you'll have to check it.
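To illustrate what that string parsing has to produce, here is a rough sketch of building one multi-row INSERT from a batch of rows. This is not Connector/Python's actual implementation, and build_multirow_insert is a name I made up; a real driver must escape the values instead of inlining them like this.

```python
def build_multirow_insert(table, rows):
    """Build a single multi-row INSERT statement (illustration only).

    Hypothetical helper: inlines values directly, which is only safe
    for trusted numeric data. Real drivers escape every value.
    """
    if not rows:
        raise ValueError("no rows to insert")
    # One "(v1,v2,...)" group per row, joined into a single VALUES list.
    values = ",".join(
        "(" + ",".join(str(v) for v in row) + ")" for row in rows
    )
    return "INSERT INTO `{0}` VALUES {1}".format(table, values)

print(build_multirow_insert("t1copy", [(1,), (2,), (3,)]))
# INSERT INTO `t1copy` VALUES (1),(2),(3)
```

This is why the general log above shows one big INSERT per batch instead of one Execute per row.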

Conclusion

oursql is not optimising the INSERT statement itself. Instead, executemany() only creates the MySQL prepared statement once and reuses it for every row. So that's still a win.
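If you do want to benchmark the two drivers against each other, a minimal wall-clock harness could look like this. The benchmark helper and the copy_with_* callables are assumptions of mine; wrap each of your copy loops in a function and pass it in.

```python
import time

def benchmark(label, fn, repeat=3):
    """Run fn() `repeat` times and report the best wall-clock time."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    print("{0}: {1:.4f}s (best of {2})".format(label, best, repeat))
    return best

# Hypothetical usage, assuming you wrapped each copy loop:
# benchmark("oursql executemany", copy_with_oursql)
# benchmark("mysql.connector executemany", copy_with_connector)
```

Taking the best of several runs reduces noise from caching and connection warm-up.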

geertjanvdk
  • If you look carefully at my code, you'll see it actually generates a multiple-row insert statement in `insert_query`; `place_holders_list` does that. And as I am fetching 25 rows, the number of rows in `place_holders_list` will vary. – Shiplu Mokaddim Jun 04 '14 at 09:52