1

So I read this article that says inserting in blocks beats one row at a time:
Which is faster: multiple single INSERTs or one multiple-row INSERT?

And I read this article that suggests how to take rows of text and insert them into a mysql store:
How to use python mysqldb to insert many rows at once

My procedure goes like this:

  • get vat of strings from input source
  • pop out the few (10's) of chunks of interest
  • populate an array with them, one chunk to one cell in a row
  • repeat until the array is full
  • try put the rows into the store with the input command

my problem is that when I try to use the following code for an array with ~100+ rows

cur.executemany("INSERT INTO OpStatus VALUES('?',?,...,?)", my_array)

I currently have one "?" for each column in the table.

It gives me the following error.

TypeError: not enough arguments for format string

My question is:
How do I reform my array or input statement so that I can make this a batch (not row) insert into the SQLITE table?

DISCLAIMER: It has been a few years since I used python. If you need to point me in a direction, or ask clarifying questions, please feel free.

Community
  • 1
  • 1
EngrStudent
  • 1,924
  • 31
  • 46
  • 2
    Experiment with an intermediate list format. `my_array.tolist()` produces a list of lists. The sql format probably works better with a list of tuples. (but I'm a little rusty on sql calls). – hpaulj Sep 24 '15 at 20:39
  • @hpaulj - can you put that as an answer? – EngrStudent Sep 24 '15 at 20:45

2 Answers2

1

I think you are on the right track. In an interactive session I can do:

conn=sqlite3.connect(':memory:')
cur=conn.cursor()
cur.execute('''CREATE TABLE array (f0 real, f1 real, f2 real)''')

insert one row

cur.execute("INSERT INTO array VALUES (?,?,?)", np.ones((3,)))

insert 2

cur.executemany("INSERT INTO array VALUES (?,?,?)", np.zeros((2,3)))

make an array and insert it

A=np.arange(12.).reshape(4,3)
cur.executemany("INSERT INTO array VALUES (?,?,?)", A)

display

for row in cur.execute('SELECT * from array'):print (row)

In Numpy ValueError: setting an array element with a sequence I learned that fetchall() returns a list of tuples. I thought, based on that, the INSERT input also had to be a list a tuples, the kind of thing that a structured array would produce. But it looks like a 2d array will work fine, as long as the number of columns of the array match the number of ?.

Your initial question asked about using ('%s',%s,...,%s)" % my_array.

'%s,%s,%s'%tuple(A[0,:])   # ok
'%s,%s,%s'%A[0,:]   # not enough arguments error

Hence my initial comment. But apparently you found documentation that pointed to the prefered (?,?,?) syntax.

Structured arrays also work, though I have to be careful about the dtype and shape:

cur.executemany("INSERT INTO array VALUES (?,?,?)", np.ones((1,),dtype='f8,f8,f8'))
Community
  • 1
  • 1
hpaulj
  • 221,503
  • 14
  • 230
  • 353
-2

A transacton would speed up insert speed

Cur.execute( "begin;");
Cur.execute( " insert ...." );
Cur.execute( " insert ...." );
Cur.execute( " insert ...." );
Cur.execute( "commit;");

sqlite

Insert into table (col1,col2) values ( v1, v2 ), (v1,v2),...

Inserts batch rows. The quoted article talks about mysql which is structurally different (e.g no connection).

Sqlite is driven by writing safely to storage, and best way to scale up performance is to use transaction. Sqlite insert is really slow

mksteve
  • 12,614
  • 3
  • 28
  • 50