
I am populating a medium-sized table (60 GB, 500 million rows). The process completes reasonably quickly if the table has no primary key (~1 hour using bulk inserts), but it takes ~10 times longer if I create the table with the primary key. I assume this is because SQLite has to verify the uniqueness constraint and update the index on every insert.

I thought a good workaround would be to add the primary key later, since building the index on an already-populated table should be much faster than maintaining it incrementally. But SQLite doesn't seem to offer a way to add a primary key after the table is created (I'm not sure why).

I guess I could just not use a primary key at all and instead add a unique index after the table is populated. Is there any disadvantage to that?

Or any better solution recommended?
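A minimal sketch of that workaround using Python's `sqlite3` module (the table and column names here are made up for illustration): populate the table with no key at all, then build the unique index in a single pass afterwards.

```python
import sqlite3

# Hypothetical in-memory database; use a file path for real data.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Create the table WITHOUT a primary key.
cur.execute("CREATE TABLE items (id INTEGER, payload TEXT)")

# 2. Bulk-insert all rows first; no per-row index maintenance happens.
rows = ((i, f"row-{i}") for i in range(100_000))
cur.executemany("INSERT INTO items VALUES (?, ?)", rows)
conn.commit()

# 3. Build the unique index once, over the fully populated table.
cur.execute("CREATE UNIQUE INDEX idx_items_id ON items(id)")
conn.commit()
```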

max

2 Answers


From a purely technical point of view, a unique index has exactly the same effect as a primary key. (In SQLite, some primary keys allow NULLs for backwards compatibility.)

The only difference is that the primary key constraint does not show up in the table definition itself, which might be a bad thing for documentation purposes.

Also see Is CREATE UNIQUE INDEX or INTEGER PRIMARY KEY more performant in SQLite.
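To illustrate the equivalence, here is a small sketch (hypothetical table name) showing that a unique index on a plain column rejects duplicates exactly as a primary key constraint would:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER)")            # no PRIMARY KEY
cur.execute("CREATE UNIQUE INDEX idx_t_id ON t(id)")  # unique index instead

cur.execute("INSERT INTO t VALUES (1)")
try:
    cur.execute("INSERT INTO t VALUES (1)")           # duplicate is rejected
except sqlite3.IntegrityError as e:
    print(e)  # UNIQUE constraint failed: t.id
```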

CL.

Run the bulk insert inside a transaction and you'll avoid quite a few things that slow inserts down.
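For example, with Python's `sqlite3` module in autocommit mode, one explicit transaction around the whole bulk insert means SQLite syncs to disk once at COMMIT rather than once per statement (row counts and names here are illustrative):

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so we
# control the transaction boundaries ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER, payload TEXT)")

rows = [(i, f"row-{i}") for i in range(50_000)]

# One transaction around the entire bulk insert.
cur.execute("BEGIN")
cur.executemany("INSERT INTO t VALUES (?, ?)", rows)
cur.execute("COMMIT")
```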

I just found this, which is a great write-up on how to speed things up in SQLite:

Improve INSERT-per-second performance of SQLite?

Harry