
I'm working on a project written in C that generates almost 350k sequences to be persisted in a SQLite database. For each sequence I have to insert (or ignore) a string into one table and update a row in another table.

I tried this guide but couldn't reach more than 30k operations per second.

I'm using transactions of 1M operations each (inserts and updates) and PRAGMA synchronous=OFF.

What options do I have to solve this bottleneck?

Pedro Alves
  • 30K/sec seems a decent throughput. The guide is using a simple test bench. Do you get the same throughput as the guide when you use its test? If yes, then it's really down to what your code does, so maybe you could post a reproducing code. – Simon Mourier May 24 '13 at 07:33

2 Answers


Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second. Transaction speed is limited by the rotational speed of your disk drive. A transaction normally requires two complete rotations of the disk platter, which on a 7200RPM disk drive limits you to about 60 transactions per second.

Transaction speed is limited by disk drive speed because (by default) SQLite actually waits until the data really is safely stored on the disk surface before the transaction is complete. That way, if you suddenly lose power or if your OS crashes, your data is still safe. For details, read about atomic commit in SQLite.

By default, each INSERT statement is its own transaction. But if you surround multiple INSERT statements with BEGIN...COMMIT then all the inserts are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed insert statements and so the time per insert statement is greatly reduced.
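To illustrate the grouping described above, here is a minimal sketch using Python's standard-library sqlite3 module (chosen because it ships with Python and is easy to run; the same BEGIN...COMMIT pattern applies verbatim from the C API via sqlite3_exec). The table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit mode: we manage transactions explicitly
conn.execute("CREATE TABLE seqs (id INTEGER PRIMARY KEY, s TEXT UNIQUE)")

# One explicit transaction around all inserts: the commit cost is paid once,
# not once per INSERT statement.
conn.execute("BEGIN")
for i in range(100_000):
    conn.execute("INSERT OR IGNORE INTO seqs (s) VALUES (?)", (f"seq-{i}",))
conn.execute("COMMIT")

print(conn.execute("SELECT COUNT(*) FROM seqs").fetchone()[0])  # 100000
```

Without the surrounding BEGIN/COMMIT, each INSERT would be its own transaction and throughput would collapse to the disk's commit rate (an in-memory database hides this, but the pattern is the point).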

Another option is to run PRAGMA synchronous=OFF. This command will cause SQLite to not wait for data to reach the disk surface, which will make write operations appear to be much faster. But if you lose power in the middle of a transaction, your database file might become corrupt.
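A small sketch of setting this PRAGMA, again using Python's stdlib sqlite3 module for convenience (in C you would pass the same statement to sqlite3_exec right after opening the connection; the file name is hypothetical):

```python
import sqlite3, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "fast.db")
conn = sqlite3.connect(path)

# Skip the fsync on commit: faster, but unsafe if the machine loses power.
conn.execute("PRAGMA synchronous=OFF")

conn.execute("CREATE TABLE t (x TEXT)")
with conn:  # a single transaction for the whole batch
    conn.executemany("INSERT INTO t VALUES (?)", ((str(i),) for i in range(10_000)))

# Querying the PRAGMA back returns 0 when synchronous is OFF.
print(conn.execute("PRAGMA synchronous").fetchone()[0])  # 0
```

Note that the PRAGMA only affects durability, not the per-transaction bookkeeping, so it complements rather than replaces batching inserts into large transactions.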

Please check out this FAQ; it explains the insert bottleneck issue, among others.

medina

I improved my database's performance by simplifying queries.

Initially I had an INSERT + UPDATE for each element to be added to a table. With my simplification, I create another, temporary table without any constraints on its columns and just insert data into it. At the end I run:

CREATE TABLE my_table AS SELECT ... FROM my_tmp_table ... GROUP BY something

and create my final table with all the constraints I need.
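The staging-table approach above can be sketched as follows, using Python's stdlib sqlite3 module for illustration (the table and column names are invented; note that CREATE TABLE ... AS SELECT does not copy constraints in SQLite, so any unique index has to be added separately afterwards):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unconstrained staging table: plain appends, no UNIQUE check per row.
conn.execute("CREATE TABLE my_tmp_table (seq TEXT, hits INTEGER)")
rows = [("AAA", 1), ("AAB", 1), ("AAA", 1), ("AAB", 1), ("AAC", 1)]
with conn:
    conn.executemany("INSERT INTO my_tmp_table VALUES (?, ?)", rows)

# Collapse duplicates once at the end, replacing the per-element INSERT+UPDATE.
conn.execute("""CREATE TABLE my_table AS
                SELECT seq, SUM(hits) AS hits
                FROM my_tmp_table GROUP BY seq""")
conn.execute("CREATE UNIQUE INDEX idx_my_table_seq ON my_table (seq)")

print(conn.execute("SELECT seq, hits FROM my_table ORDER BY seq").fetchall())
# [('AAA', 2), ('AAB', 2), ('AAC', 1)]
```

The win comes from turning many small read-modify-write operations into one bulk aggregation that SQLite executes in a single pass.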

Now it reaches the performance I need.

Pedro Alves