
We have a file of 150M lines, each of which updates a single table in a PostgreSQL database with a command such as:

UPDATE "events" SET "value_1" = XX, "value_2" = XX, "value_3" = XX, "value_4" = XX WHERE "events"."id" = SOME_ID;

All ids are unique, so no single update can apply to several events. Currently the whole update takes approximately a few days when we run it with \i update.sql in psql.

Is there any faster way to run it?

nateless

1 Answer

  • Simplest: add set synchronous_commit=off before \i update.sql

  • Better:

    • Split the file to parts of like 100000 updates:
      split -l 100000 -a 6 --additional-suffix=.sql update.sql update-part
    • Run these updates in parallel, each file in a single transaction, for example with:
      /bin/ls update-part*.sql | xargs --max-procs=8 --replace psql --single-transaction --file={}
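
The two bullet points above can be combined into one sketch. Assumptions not in the original answer: update.sql sits in the current directory, connection settings come from the usual PG* environment variables, and the synthetic input file and the psql fallback exist only so the sketch can run without a real database.

```shell
#!/bin/sh
# Sketch of the recipe above: split update.sql into chunks, then feed
# the chunks to psql in parallel, each in a single transaction.
set -e

# Allow overriding psql (e.g. PSQL=true for a dry run); fall back to a
# no-op when psql is not installed, so the sketch stays runnable.
PSQL=${PSQL:-psql}
command -v "$PSQL" >/dev/null 2>&1 || PSQL=true

# Optionally disable synchronous commit for every session (libpq):
# export PGOPTIONS='-c synchronous_commit=off'

# For demonstration only: synthesize an input file if none exists.
[ -f update.sql ] || seq 1 250000 > update.sql

# Split into files of 100000 statements each:
# update-partaaaaaa.sql, update-partaaaaab.sql, ...
split -l 100000 -a 6 --additional-suffix=.sql update.sql update-part

# Run up to 8 chunks at once; --replace substitutes {} with each filename.
ls update-part*.sql \
  | xargs --max-procs=8 --replace "$PSQL" --single-transaction --file={}
```

Each chunk runs in its own transaction, so a failed chunk rolls back alone and can be re-run without touching the others.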
Tometzky