4

I have a CSV file which is approx 350GB. It is stored on the postgres server. I am attempting to copy it to a table in my DB using:

COPY <table_name>(<column1>, <column2>,...)
FROM '<CSV_location>'
DELIMITER ','
CSV HEADER;

The only indexing occuring is a primary key on one of the columns.

I am uncertain of the progress however the table is currently approx 150GB. It has taken around 5 days, and it seems nowhere near completion.

What could be the issue here?

UPDATE Two issues that I have come across and resolved.

  1. Database storage path was incorrect and the volume was very close to capacity or had already reached it (99% full - 6GB left). I am not sure why I was not getting any notification of this. I have since changed the storage path to a new one with much more availible capacity.

  2. I am no longer applying the primary key constraint on the hash column during table creation or pre copying of data. The copying of the data seems to take significantly longer when doing this.

It took approx 1 hour to copy the 350GB of CSV data to the DB and hundreds of millions of rows, without any primary key index.

  • 2
    Use system monitoring tools, like `top` or `vmstat`, to see what is going on and if you are IO limited or CPU limited. – jjanes Mar 15 '21 at 03:09
  • Are the rows in the csv file sorted by the primary key column? – jjanes Mar 15 '21 at 03:10
  • @jjanes The primary key column is a hash. So no, not sorted by hash. – aquarianmoon Mar 15 '21 at 04:26
  • @jjanes The only thing I can see is the postgres process has PR 20 and NI 0 while other processes related to ZFS have PR 1 and NI -19 – aquarianmoon Mar 15 '21 at 05:00
  • 1
    Could be slow storage or a lock. – Laurenz Albe Mar 15 '21 at 07:11
  • @aquarianmoon if possible: it is customary that any constraints, e.g. pk, is deleted **before** importing such a large amount of data. Right after import you can recreate them. Ottherwise PostgreSQL will need to check / resort the table after every insert. Also, consider using `unlogged tables` if possible: https://stackoverflow.com/a/59590141/2275388 – Jim Jones Mar 15 '21 at 13:36

0 Answers0