As a sort of exercise, I'm seeing how fast I can bulk-insert records into SQLite. The data set is about 50 MB and contains 1M rows. Here is what I currently have:
sqlite3 *db;
int rc = sqlite3_open("MyDB.db", &db);
sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, NULL);
char *sql_buffer = malloc(200 * sizeof(char));
for (int i = 0; i < n_rows; i++) {          /* row[] and n_rows hold the already-parsed records */
    snprintf(sql_buffer, 200,
             "insert into myTable (id, format, size) VALUES (%d, '%s', %d)",
             row[i].id, row[i].format, row[i].size);
    rc = sqlite3_exec(db, sql_buffer, NULL, NULL, NULL);
}
sqlite3_exec(db, "COMMIT TRANSACTION", NULL, NULL, NULL);
free(sql_buffer);
Doing the above 1M inserts takes 3.39s. About 90% of that time is spent in the SQLite inserts and 10% in the snprintf calls. I tried the following to see if it would increase the speed:
- Committing after every 10K, 50K, or 100K inserts instead of only once at the end (1M), roughly as in the sketch after this list.
- Writing to an in-memory database instead of a file on disk.
- Changing various pragmas, for example:
PRAGMA cache_size = 400000; PRAGMA synchronous = OFF; PRAGMA journal_mode = OFF;
...
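In case it helps, here is roughly how I applied the pragmas and the batched commits; the 50K batch size is just one of the values I tried, and the INSERT itself is built exactly as in the code above:

sqlite3_exec(db, "PRAGMA cache_size = 400000", NULL, NULL, NULL);
sqlite3_exec(db, "PRAGMA synchronous = OFF", NULL, NULL, NULL);
sqlite3_exec(db, "PRAGMA journal_mode = OFF", NULL, NULL, NULL);
sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, NULL);
for (int i = 0; i < n_rows; i++) {
    /* ... build and run the INSERT exactly as above ... */
    if ((i + 1) % 50000 == 0) {             /* commit every 50K rows, then start a new transaction */
        sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
        sqlite3_exec(db, "BEGIN TRANSACTION", NULL, NULL, NULL);
    }
}
sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);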
None of those seemed to make more than about a 0.1s difference.
Are there any further ways I could increase the insert speed here? If we assume the data has to be parsed first and cannot simply be loaded directly from something like a CSV file, would it theoretically be possible to insert 1M rows in under 1s? If not, what is the limiting factor?