
I have an input CSV file containing something like:

SD-32MM-1001,"100.00",4/11/2012
SD-32MM-1001,"1,000.00",4/12/2012

I was trying to import it with COPY into a PostgreSQL table (varchar, float8, date) and ran into an error:

# copy foo from '/tmp/foo.csv' with header csv;
ERROR:  invalid input syntax for type double precision: "1,000.00"
Time: 1.251 ms

Aside from preprocessing the input file, is there some setting in PG that will let it read a file like the one above and convert the value to numeric form during COPY? Or some mechanism other than COPY?

If preprocessing is required, can it be specified as part of the COPY command (not the psql \copy)?

Thanks a lot.

Dinesh
  • Just FYI, there's currently work ongoing to enable input and output filters on `COPY` and `\copy`, but right now your only options are to pre-process the file with some external script or to import that column into a temp table as text then insert a cleaned up version into the real table. – Craig Ringer Apr 20 '13 at 10:34
  • Thanks for the heads up. Having options for hooks will surely help. – Dinesh Apr 22 '13 at 23:25

2 Answers


The alternative to preprocessing is to first COPY into a temporary table as text. From there, insert into the definitive table using the to_number function:

select to_number('1,000.00', 'FM000,009.99')::double precision;
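
For example, here is a minimal sketch of that flow against the sample data in the question, assuming a target table foo (part varchar, amount float8, d date). The staging table and column names are made up for illustration, and the cleanup strips the thousands separators with a plain replace() as a simpler alternative to the to_number call above:

-- stage the file as raw text first (foo_raw is a hypothetical name)
create temp table foo_raw (part text, amount text, d text);

copy foo_raw from '/tmp/foo.csv' with csv;

-- clean up on the way into the real table; the date cast assumes
-- the default MDY DateStyle for values like 4/11/2012
insert into foo (part, amount, d)
select part,
       replace(amount, ',', '')::double precision,
       d::date
from foo_raw;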
Clodoaldo Neto
  • Thanks Clodoaldo. I am already doing that and don't really like it. In addition to writing the same thing over and over for multiple data interchange files (and the maintenance cost), this also requires foreknowledge of the actual input format. I was hoping for something more like a locale-aware atod() that PostgreSQL could be made to try on exceptions before just giving up. (I have a second, separate question on the Linux side: please see http://stackoverflow.com/questions/16110601/how-to-get-awk-cut-etc-to-not-peek-for-fs-inside-quoted-strings. Thanks.) – Dinesh Apr 19 '13 at 17:47

It's an odd CSV file that surrounds numeric values with double quotes, but leaves values like SD-32MM-1001 unquoted. In fact, I'm not sure I've ever seen a CSV file like that.

If I were in your shoes, I'd try COPY against a file formatted like this:

"SD-32MM-1001",100.00,4/11/2012
"SD-32MM-1001",1000.00,4/12/2012

Note that the numbers have no commas. I was able to import that file successfully with:

copy test from '/fullpath/test.dat' with csv

I think your best bet is to get better formatted output from your source.

Mike Sherrill 'Cat Recall'
  • Mike, thanks but I wish... FWIW I _have_ to deal with this style. CSVs like that are in my experience fairly common and definitely well-formed. – Dinesh Apr 20 '13 at 00:16
  • 1
    @user1397919 "Well formed CSV" is a bit like "tree shaped plant". There are rather a lot of different shapes of tree ;-) – Craig Ringer Apr 20 '13 at 10:33