I've got 500K rows I want to insert into PostgreSQL using SQLAlchemy.
For speed, I'm inserting them using session.bulk_insert_mappings().
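For context, here's roughly what I'm doing now. The model, connection string, and row data below are simplified placeholders, not my real schema:

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Simplified stand-in for my real model.
class MyModel(Base):
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    name = Column(String)

# Placeholder connection string.
engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydb")
Base.metadata.create_all(engine)

# 500K plain dicts keyed by column name.
rows = [{"id": i, "name": f"row {i}"} for i in range(500_000)]

with Session(engine) as session:
    # One bulk insert for everything, one commit at the end.
    session.bulk_insert_mappings(MyModel, rows)
    session.commit()
```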
Normally, I'd break an insert this large into smaller batches to minimize session bookkeeping. However, bulk_insert_mappings() takes plain dicts and bypasses most of that bookkeeping anyway (no identity map, no attribute change tracking).
Will I still see a speed improvement if I break the insert into smaller discrete batches, say one insert per 10K rows?
If so, should I commit the PostgreSQL transaction after each 10K-row batch, or hold one transaction open for the whole load?
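Concretely, the batched version I'm considering looks something like this (reusing MyModel, engine, and rows from the snippet above; the 10K batch size and the placement of commit() are exactly the open questions):

```python
BATCH_SIZE = 10_000

with Session(engine) as session:
    for start in range(0, len(rows), BATCH_SIZE):
        session.bulk_insert_mappings(MyModel, rows[start:start + BATCH_SIZE])
        # Option A: commit here, closing the transaction after each batch.
        # session.commit()
    # Option B: hold one transaction open and commit once at the end.
    session.commit()
```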