
If I have thousands of INSERT INTO statements, would there be any significant difference in speed depending on whether I wrap them in one transaction or not?

In other words, is there any significant speed difference between alternative A and alternative B?

Alternative A:

BEGIN
INSERT INTO STATEMENT 1
INSERT INTO STATEMENT 2
...
INSERT INTO STATEMENT 1000
COMMIT;

Alternative B:

INSERT INTO STATEMENT 1
INSERT INTO STATEMENT 2
...
INSERT INTO STATEMENT 1000
Thomas
    possible duplicate of [How to get high performance under a large transaction (postgresql)](http://stackoverflow.com/questions/11322029/how-to-get-high-performance-under-a-large-transaction-postgresql) – Fabian S. Dec 09 '14 at 22:03
  • use multirow `INSERT` statements (sketched just below this list) – vp_arth Dec 10 '14 at 14:32
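
As a side note on vp_arth's comment, a multi-row `INSERT` packs many rows into a single statement. A minimal sketch, assuming a hypothetical `items` table:

    -- hypothetical table, for illustration only
    CREATE TABLE items (id integer PRIMARY KEY, val text);

    -- one statement carrying several rows instead of one INSERT per row
    INSERT INTO items (id, val) VALUES
        (1, 'a'),
        (2, 'b'),
        (3, 'c');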

2 Answers


Yes, there will be a huge speed difference, because COMMIT is expensive: it requires a flush to disk. Unless you have fast write-back caching (a battery-backed write-back RAID controller, or a good SSD), that flush takes quite a while. pg_test_fsync will help you measure how long.

Batch the work into large transactions. If you need to roll back individual inserts you can use SAVEPOINTs. They carry some performance cost, but they're not as expensive as full transactions, especially if you RELEASE SAVEPOINT each one as soon as the next is created.
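
A minimal sketch of that savepoint pattern, assuming a hypothetical `items` table:

    BEGIN;

    SAVEPOINT sp;
    INSERT INTO items (id, val) VALUES (1, 'a');
    RELEASE SAVEPOINT sp;   -- insert succeeded, so discard the savepoint

    SAVEPOINT sp;
    INSERT INTO items (id, val) VALUES (2, 'b');
    -- had this insert failed, ROLLBACK TO SAVEPOINT sp; would undo only this row
    RELEASE SAVEPOINT sp;

    COMMIT;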

There are diminishing returns with transaction size, so there's no point doing millions and millions of rows in a single transaction unless you need that for atomicity. Batch into decent-sized chunks. The optimum size depends mostly on how expensive disk flushes are on your storage, but I'd aim for a few thousand rows to start with.
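
In practice that means repeating the BEGIN/COMMIT wrapper around each chunk rather than around the whole job; a rough sketch with the same hypothetical table:

    -- batch 1: a few thousand rows per transaction
    BEGIN;
    INSERT INTO items (id, val) VALUES (1, 'a');
    -- ... more rows ...
    COMMIT;

    -- batch 2
    BEGIN;
    INSERT INTO items (id, val) VALUES (5001, 'b');
    -- ... more rows ...
    COMMIT;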

Better yet, use COPY, which will be much faster, especially if network round-trips are involved.
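
A sketch of the COPY form, assuming the same hypothetical `items` table and a made-up CSV path:

    -- server-side COPY reads a file that lives on the database server
    COPY items (id, val) FROM '/path/on/server/items.csv' WITH (FORMAT csv);

    -- from psql, \copy streams a client-side file through the same machinery:
    -- \copy items (id, val) FROM 'items.csv' WITH (FORMAT csv)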

For more detail, see *How to speed up insertion performance in PostgreSQL*.

Craig Ringer

If there is no need for a rollback in your specific situation, using transactions would actually be faster.

The reason is that transactions, by design, submit your queries in batches. Using transactions adds little overhead most of the time (unless your transaction is really huge).

Dany Caissy
  • Thanks a lot for your reply! What is a huge transaction? Is a transaction with a thousand INSERT INTO statements not considered a huge transaction? – Thomas Dec 09 '14 at 22:01
  • 1000 isn't much. I've built an import script which did 10k inserts in less than a second using one transaction and a multi-row insert command. – Fabian S. Dec 09 '14 at 22:05
  • What does rollback have to do with it? Or do you mean "rollback of individual inserts" ? – Craig Ringer Dec 09 '14 at 22:58