
I have the following table :

persons(id,id_tech,name,nationality,id_list)

And a CSV file containing the data for the columns id_tech, name, and nationality.

Importing the data like this works:

\copy persons(id_tech,name,nationality) FROM '/path/to/file.csv' DELIMITER ',' CSV

The id_list value is not in the CSV file because the file has to be imported on many servers where the value of this id can be different.

Is there a way to import the CSV file while providing an additional value for a specific column? I can't find anything relevant in the documentation.

Edit 1 --
Note that all my commands will be performed with pqxx in C++ (multi-platform). I'm trying to avoid editing the file because of its size.

Edit 2 --
I'm considering the following approach:

  • Create a temp table with a correct default value for the field I need
  • Import File into this temp table
  • Copy temp table to final table
  • Remove temp table

But I'm unsure about the performance. The biggest import can be close to 500K lines.
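For reference, the staging-table steps above could look like the following psql sketch. Table and column names are taken from the question; the default value 42 is only a placeholder for the server-specific id_list value:

```sql
-- Create a staging table with the same columns and defaults as persons
CREATE TEMP TABLE persons_staging (LIKE persons INCLUDING DEFAULTS);

-- Give id_list a server-specific default (42 is a placeholder)
ALTER TABLE persons_staging ALTER COLUMN id_list SET DEFAULT 42;

-- Import the CSV; id_list is filled in by the default
\copy persons_staging(id_tech,name,nationality) FROM '/path/to/file.csv' DELIMITER ',' CSV

-- Move the rows into the final table and clean up
INSERT INTO persons(id_tech, name, nationality, id_list)
SELECT id_tech, name, nationality, id_list FROM persons_staging;
DROP TABLE persons_staging;
```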

Kevin M. Mansour
grunk

2 Answers


On Linux you could, for example, use awk to add the additional value field to your data and have psql read from stdin:

$ cat copy.sql
\copy persons(id_tech,name,nationality,extra_col) FROM '/dev/stdin' DELIMITER ',' CSV

and then:

$ awk '
BEGIN {
    FS=OFS=","
}
{
    print $1,$2,$3,"additional value"
}' file.csv | psql -h host -d database -f copy.sql

(tested on PostgreSQL 12.4)

James Brown
  • Thanks for your answer, but I'm interacting with Postgres from C++, not only under Linux. I could modify the file in C++, but due to its size I'd like to avoid that. – grunk Dec 03 '20 at 16:04
  • If you can't read from stdin and need to read from a file, I guess you just have to edit the file before loading or load to a staging table first. Good luck. – James Brown Dec 03 '20 at 16:08

Found a solution which seems more than OK.

As stated in my OP, I use libpqxx to insert the data, so instead of running a COPY SQL request directly, I simply use pqxx::stream_to, which allows me to add additional fields if needed:

// w is a pqxx::work (transaction); mTable is the name of the target table
pqxx::stream_to stream(w, mTable, std::vector<std::string>{"id_tech", "name", "nationality", "extra_col"});
csv::CSVReader reader(filePath);
for (csv::CSVRow& row : reader) {
    // Stream the three CSV fields plus the server-specific value for the extra column
    stream << std::make_tuple(row[0].get<long long>(), row[1].get<std::string>(), row[2].get<std::string>(), custom_id);
}
stream.complete();

It takes around 10 s to import 300K lines, which is fine for my needs.

grunk