9

I have a CSV file in which the empty values of an integer column were saved as "" (a quoted empty string).

I want to COPY them into a table as NULL values.

From Java code, I have tried these:

String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',',  HEADER true)";
String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',', NULL '', HEADER true)";

I get: PSQLException: ERROR: invalid input syntax for type numeric: ""

String sql = "COPY " + tableName + " FROM STDIN (FORMAT csv,DELIMITER ',', NULL '\"\"', HEADER true)";

I get: PSQLException: ERROR: CSV quote character must not appear in the NULL specification

Has anyone done this before?

Hieudien

3 Answers

16

I assume you are aware that numeric data types have no concept of "empty string" (''). A value is either a number or NULL (or 'NaN' for numeric, but not for integer et al.)

It looks like you exported from a string data type like text and had some actual empty strings in there, which are now represented as "" (" being the default QUOTE character in CSV format).

NULL would be represented by nothing, not even quotes. The manual:

NULL

Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.

You cannot define "" to generally represent NULL since that already represents an empty string. It would be ambiguous.

To fix, I see two options:

  1. Edit the CSV file / stream before feeding it to COPY and replace "" with nothing. This might be tricky if you have actual empty strings in there as well, or "" escaping a literal " inside strings.

  2. (What I would do.) Import to an auxiliary temporary table with identical structure except for the integer column converted to text. Then INSERT (or UPSERT?) to the target table from there, converting the integer value properly on the fly:

-- empty temp table with identical structure
CREATE TEMP TABLE tbl_tmp AS TABLE tbl LIMIT 0;

-- ... except for the int / text column
ALTER TABLE tbl_tmp ALTER col_int TYPE text;

COPY tbl_tmp ...;

INSERT INTO tbl  -- identical number and names of columns guaranteed
SELECT col1, col2, NULLIF(col_int, '')::int  -- list all columns in order here
FROM   tbl_tmp;

Temporary tables are dropped at the end of the session automatically. If you run this multiple times in the same session, either just truncate the existing temp table or drop it after each transaction.
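
For option 1, a minimal Java sketch of the pre-processing step could look like the following. The class and method names (`QuotedEmptyToNull`, `normalize`) are made up for illustration, and it assumes the default `"` QUOTE character with `""` as the escape for a literal quote. Note that it drops every field that is exactly `""`, so real empty strings in text columns would also end up as NULL, which is the caveat mentioned above.

```java
// Hypothetical helper, not part of any library: rewrites CSV text so that
// fields consisting of exactly "" become unquoted empty fields, which COPY
// in CSV format reads as NULL by default.
public final class QuotedEmptyToNull {

    static String normalize(String csv, char delimiter) {
        StringBuilder out = new StringBuilder(csv.length());
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, so we stay inside the field.
                inQuotes = !inQuotes;
                field.append(c);
            } else if (!inQuotes && (c == delimiter || c == '\n' || c == '\r')) {
                // Field boundary outside quotes: emit the field, then the separator.
                flush(field, out);
                out.append(c);
            } else {
                // Ordinary character, or a delimiter/newline inside a quoted field.
                field.append(c);
            }
        }
        flush(field, out);
        return out.toString();
    }

    private static void flush(StringBuilder field, StringBuilder out) {
        // Drop the field only if its raw text is exactly two quote characters.
        if (!field.toString().equals("\"\"")) {
            out.append(field);
        }
        field.setLength(0);
    }
}
```

The normalized text can then be wrapped in a `java.io.StringReader` and passed to COPY ... FROM STDIN. For large files a streaming variant would be preferable to holding the whole file in a String.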

Zachary Ryan Smith
Erwin Brandstetter
14

Since Postgres 9.4 you can use FORCE_NULL. For the listed columns, values are matched against the null string even when they are quoted, so with the default null string a quoted empty string ("") is converted into NULL. Very handy with CSV files (the option is only allowed when using CSV format).

The syntax is as follows:

COPY table FROM '/path/to/file.csv' 
WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL (columnname));

Further details are explained in the documentation: https://www.postgresql.org/docs/current/sql-copy.html
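
Applied to the Java code from the question, this might look like the sketch below. The class and method names (`ForceNullCopy`, `buildSql`) are hypothetical, and table and column names are concatenated as-is, exactly like in the question, so they must come from a trusted source.

```java
// Hypothetical helper that builds a COPY ... FROM STDIN statement with FORCE_NULL
// for the given columns. Identifiers are NOT quoted or validated here.
public final class ForceNullCopy {

    static String buildSql(String tableName, String... forceNullColumns) {
        return "COPY " + tableName
                + " FROM STDIN (FORMAT csv, DELIMITER ',', HEADER true, FORCE_NULL ("
                + String.join(", ", forceNullColumns) + "))";
    }
}
```

With the PostgreSQL JDBC driver, the statement would typically be run through `CopyManager`, for example `connection.unwrap(org.postgresql.PGConnection.class).getCopyAPI().copyIn(ForceNullCopy.buildSql("tbl", "col_int"), reader)`, where `tbl`, `col_int` and `reader` are placeholders for your own table, column and CSV input.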

Camilo Silva
moojen
  • 7
    Option `FORCE_NULL` **is** with underscore and should be specified inside the "WITH (...)" clause. For example: `COPY table FROM '/path/to/file.csv' WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL (field1, field2, field3));` – spatar Jul 13 '20 at 05:16
  • @spatar You're right, indeed this is the current preferred syntax. The syntax I've used is still supported, even in version 13, but nonetheless it makes more sense to use the 'standard syntax'. I've corrected my example, thanks! – moojen Nov 20 '20 at 14:14
  • 1
    is there a way to tell force_null ALL fields, without listing them out? Something like FORCE_NULL (*)? – Zachary Ryan Smith Nov 20 '20 at 20:06
  • @ZacharyRyanSmith It appears that this is only possible for FORCE_QUOTE, but you can of course try to see if it works. – moojen Nov 24 '20 at 11:18
0

On Amazon Redshift (these options are not available in stock PostgreSQL), you can load all empty and blank fields as NULL by adding EMPTYASNULL and BLANKSASNULL to the COPY command.

Syntax:

    copy Table_name (columns_list)
    from 's3://{bucket}/{s3_bucket_directory_name + manifest_filename}'
    iam_role '{REDSHIFT_COPY_COMMAND_ROLE}' emptyasnull blanksasnull 
    manifest DELIMITER ',' IGNOREHEADER 1 compupdate off csv gzip;

Note: This applies to every record that contains empty or blank values.

helvete