
We have a file of 150M lines, each of which updates a single table in a PostgreSQL database with a command such as:

UPDATE "events" SET "value_1" = XX, "value_2" = XX, "value_3" = XX, "value_4" = XX WHERE "events"."id" = SOME_ID;

All ids are unique, so no single update can apply to several events. Currently the whole update takes approximately a few days when we run it with \i update.sql in psql.

Is there any faster way to run it?

nateless

1 Answer

  • Simplest: add set synchronous_commit=off before \i update.sql

  • Better:

    • Split the file to parts of like 100000 updates:
      split -l 100000 -a 6 --additional-suffix=.sql update.sql update-part
    • Run these updates in parallel, each file in a single transaction, for example with:
      /bin/ls update-part*.sql | xargs --max-procs=8 --replace psql --single-transaction --file={}
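
The two bullet points above can be combined into one sketch. Assumptions not in the original answer: update.sql sits in the current directory, connection settings come from the usual PG* environment variables, and the synthetic input file and the psql fallback exist only so the sketch can run without a real database.

```shell
#!/bin/sh
# Sketch of the recipe above: split update.sql into chunks, then feed
# the chunks to psql in parallel, each in a single transaction.
set -e

# Allow overriding psql (e.g. PSQL=true for a dry run); fall back to a
# no-op when psql is not installed, so the sketch stays runnable.
PSQL=${PSQL:-psql}
command -v "$PSQL" >/dev/null 2>&1 || PSQL=true

# Optionally disable synchronous commit for every session (libpq):
# export PGOPTIONS='-c synchronous_commit=off'

# For demonstration only: synthesize an input file if none exists.
[ -f update.sql ] || seq 1 250000 > update.sql

# Split into files of 100000 statements each:
# update-partaaaaaa.sql, update-partaaaaab.sql, ...
split -l 100000 -a 6 --additional-suffix=.sql update.sql update-part

# Run up to 8 chunks at once; --replace substitutes {} with each filename.
ls update-part*.sql \
  | xargs --max-procs=8 --replace "$PSQL" --single-transaction --file={}
```

Each chunk runs in its own transaction, so a failed chunk rolls back alone and can be re-run without touching the others.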
Tometzky