
I am using PostgreSQL version 13.0.1. (I have to use Postgres.)

Table Structure

CREATE TABLE public.orders
(
    key integer NOT NULL,
    skey integer NOT NULL,
    status character(1) COLLATE pg_catalog."default" NOT NULL,
    price numeric(15,2) NOT NULL,
    date date NOT NULL,
    priority character(15) COLLATE pg_catalog."default" NOT NULL,
    man character(15) COLLATE pg_catalog."default" NOT NULL,
    priority2 integer NOT NULL,
    comment character varying(79) COLLATE pg_catalog."default" NOT NULL
)

TABLESPACE pg_default;

ALTER TABLE public.orders
    OWNER to postgres;

CSV file contents (the first line is the header, the second line is data):

key ,skey,  status , price , date       ,priority2        ,man,             priority   ,comment
1   ,2   ,  3      , 45.33 , 1992-02-21 ,4-NOT SPECIFIED  ,man#000000058  , 0          ,ggle. so, whatever you say.

My Query

COPY public.orders FROM 'D:\orders.csv' WITH (format csv,Header);

The Error:

ERROR:  extra data after last expected column
CONTEXT:  COPY orders, line 3: "1,2,3,45.33,1992-02-21,4-NOT SPECIFIED,man#000000058,0,ggle. so, whatever you say.
"
SQL state: 22P04

The error is probably due to the last column, which contains a string (ggle. so, whatever you say.) that is not wrapped in quotes to mark it as a single field and that also contains a comma.

My desired output is to have the query complete successfully with the last column read as a whole string. I am also not allowed to change the data in the file.
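
For comparison, COPY's csv format only treats an embedded comma as part of a field when that field is quoted, so a data line it would accept without error would look like this (shown purely for illustration, since the file itself cannot be changed):

1,2,3,45.33,1992-02-21,4-NOT SPECIFIED,man#000000058,0,"ggle. so, whatever you say."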

I have tried advice from these links:
Importing CSV and commas in string values
https://www.postgresql.org/docs/13/sql-copy.html

but I still ended up with the error shown above. Please enlighten me.

xyiong
  • similar to: https://stackoverflow.com/questions/1120109/how-to-export-table-as-csv-with-headings-on-postgresql – Christophe Roussy Nov 02 '20 at 14:29
  • @ChristopheRoussy Thank you for your response. However, my query already reflects the answer in that thread, yet I still get the error I mentioned above. – xyiong Nov 02 '20 at 14:51
  • The best option is to fix the csv. If there are commas in the fields, you will need quotes around the field. However, it looks like this might actually be fixed width? If that's true, you could import the whole row and then break it up using substr. – Jeremy Nov 02 '20 at 15:20

2 Answers

1

I would do this in an indirect manner. Here is an illustration:

  1. Create a temporary table with one text column only
create temporary table dropme_later (full_line text);
  2. Read the contents of the text file into dropme_later, each full line into column full_line:
copy dropme_later from 'D:\orders.csv' 
with
(
 format 'csv',
 header 'true',
 delimiter E'\b'
);

Note that the delimiter is set to a character that does not occur in the data, i.e. a backspace.
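
Before moving on, the staged rows can be sanity-checked with a quick query (what to inspect here is just a suggestion):

select count(*) as lines_loaded from dropme_later;
select full_line from dropme_later limit 5;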

  3. Transfer data from dropme_later into public.orders with a query and a transformation for the last column. This gives you a lot of freedom and flexibility to use select expressions, a where clause, whatever. In this case, first split each line into an array and then use array expressions:
with t(a) as 
(
 select string_to_array(full_line, ',') from dropme_later
)
insert into public.orders 
select
  a[1], a[2], a[3], a[4], a[5], a[6], a[7], a[8],
  array_to_string(a[9:11], ',')
from t;

Hope this helps. A hint: there is a fine FDW, file_text_array_fdw, that may help a lot in ETL cases.
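
For completeness, a rough sketch of that FDW route; the extension, server, and option names below are assumptions from memory, so verify them against the file_text_array_fdw documentation before relying on them:

-- extension and wrapper names assumed; check the project's README
create extension file_textarray_fdw;
create server orders_file_server foreign data wrapper file_textarray_fdw;

-- one text[] column receives each CSV line already split into fields
create foreign table orders_raw (line text[])
  server orders_file_server
  options (filename 'D:\orders.csv');   -- option name assumed to mirror file_fdw

select line[1], line[9] from orders_raw limit 5;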

Stefanov.sm
  • Thank you for the response! I tried out your suggestion and got an error saying: ERROR: column "key" is of type integer but expression is of type text LINE 15: a[1], a[2], a[3], a[4], a[5], a[6], a[7], a[8], ^ HINT: You will need to rewrite or cast the expression. After recasting, I get an error saying: invalid input syntax for type integer: "O" – xyiong Nov 02 '20 at 15:38
  • @xyiong Sure, cast all expressions that need to yield numbers or a date accordingly, i.e. a[1]::integer, a[2]::integer, a[4]::numeric, a[5]::date, a[8]::integer. Please note that my answer is an illustration, the specifics remain to be done. – Stefanov.sm Nov 02 '20 at 15:43
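
Putting the casts from the comment above together with the column layout visible in the question's sample row, the final step could look roughly like this (a sketch only; the trim() calls and the open-ended slice for the comment column are assumptions based on that single sample line):

with t(a) as
(
 select string_to_array(full_line, ',') from dropme_later
)
insert into public.orders
select
  a[1]::integer,                -- key
  a[2]::integer,                -- skey
  trim(a[3]),                   -- status is character(1), so strip the padding spaces
  a[4]::numeric,                -- price
  a[5]::date,                   -- date
  trim(a[6]),                   -- priority (the sample row carries the text value here)
  trim(a[7]),                   -- man
  a[8]::integer,                -- priority2 (the sample row carries the number here)
  array_to_string(a[9:], ',')   -- comment: re-join the pieces split on embedded commas
from t;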
-2

Do this: table_name# copy table_name from 'path/...csv' with csv header;

This passes the csv format and header options to COPY.

Som Dc