Perfect solution!
Not need to know all the CSV columns, only extract what you know.
Use at SQL CREATE EXTENSION PLpythonU;
: if the command produce an error like "could not open extension control file ... No such file" you need to install pg-py extra-packages. In standard UBUNTU (16 LTS) is simple, apt install postgresql-contrib postgresql-plpython
.
CREATE FUNCTION get_csvfile(
file text,
delim_char char(1) = ',',
quote_char char(1) = '"')
returns setof text[] stable language plpythonu as $$
import csv
return csv.reader(
open(file, 'rb'),
quotechar=quote_char,
delimiter=delim_char,
skipinitialspace=True,
escapechar='\\'
)
$$;
INSERT INTO mytable(info)
SELECT jsonb_build_array(c[1],c[2],c[3])
FROM get_csvfile('/tmp/myfile1.csv') c;
The split_csv() function was defined here. The csv.reader
is very reliable (!).
Not tested for big-big CSV... But expected Python do job.
PostgreSQL workaround
It is not a perfect solution, but it solves the main problem, that is the
... big temporary table, so lost CPU, disk and my time"...
This is the way we do it, a workaround with file_fdw
!
Adopt your conventions to avoid file-copy and file-permission confusions... The standard file path for a CSV. Example: /tmp/pg_myPrj_file.csv
Initialise your database or SQL script with the magic extension,
CREATE EXTENSION file_fdw;
CREATE SERVER files FOREIGN DATA WRAPPER file_fdw;
For each CSV file, myNewData.csv
,
3.1. make a symbolic link (or scp
remote copy) for your new file ln -sf $PWD/myNewData.csv /tmp/pg_socKer_file.csv
3.2. configure the file_fdw for your new table (suppose mytable
).
CREATE FOREIGN TABLE temp1 (a int, b text, c text)
SERVER files OPTIONS (
filename '/tmp/pg_socKer_file.csv',
format 'csv',
header 'true'
);
PS: after running SQL script with psql
, when having some permission problem, change owner of the link by sudo chown -h postgres:postgres /tmp/pg_socKer_file.csv
.
3.3. use the file_fdw table as source (suppose populating mytable
).
INSERT INTO mytable(info)
SELECT json_build_array(a,b,c) FROM temp1;
Thanks to @JosMac (and his tutorial)!
NOTE: if there is a STDIN way to do it (exists??), will be easy, avoiding permission problems and use of absolute paths. See this answer/discussion.