2

I have a table in Postgres that I would like to copy into from a csv file. I usually do as so:

\copy my_table from '/workdir/some_file.txt' with null as 'NULL' delimiter E'|' csv header;

The problem is now however that my_table has one column extra that I would like to fill in manually on copy, with the same value 'b'. Here are my tables:

some_file.txt:
col1 | col2 | col3
  0     0      1
  0     1      3

my_table :
xtra_col | col1 | col2 | col3
   a        5      2       5
   a        6      2       5
   a        7      2       5

Desired my_table after copy into:
xtra_col | col1 | col2 | col3
   a        5      2       5
   a        6      2       5
   a        7      2       5
   b        0      0       1
   b        0      1       3

Is there a way to mention the persisting 'b' value in the copy statement for column `xtra_col'. If not, how should I approach this problem?

jarlh
  • 42,561
  • 8
  • 45
  • 63
callmeGuy
  • 944
  • 2
  • 11
  • 28

2 Answers2

5

I usually load a file into a temporary table then insert (or update) from there. In this case,

CREATE TEMP TABLE input (LIKE my_table);
ALTER TABLE input DROP xtra_col;

\copy input from 'some_file.txt' ...

INSERT INTO my_table
SELECT 'b', * FROM input;

The INSERT statement looks tidy, but that can only really be achieved when the columns you want to exclude are on either end of my_table. In your (probably simplified) example, xtra_col is at the front so we can quickly append the remaining columns using *.

If the arrangement of CSV file columns differs my_table much more than that, you'll need to start typing out column names.

Chris Bandy
  • 1,498
  • 13
  • 7
3

You could set a (temporary) default value for the xtra_col:

ALTER TABLE my_table ALTER COLUMN xtra_col SET DEFAULT 'b';
COPY my_table (col1, col2, col3) FROM '/workdir/some_file.txt' WITH (FORMAT CSV, DELIMITER '|', NULL 'NULL', HEADER true);
ALTER TABLE my_table ALTER COLUMN xtra_col DROP DEFAULT;

is there a way to not repeat columns in my_table? the real my_table has 20 columns and i wouldnt want to call all of them.

If my_table has a lot of columns and you wish to avoid having to type out all the column names, you could dynamically generate the COPY command like this:

SELECT format($$COPY my_table(%s) FROM '/workdir/some_file.txt' WITH (FORMAT CSV, DELIMITER '|', NULL 'NULL', HEADER true);$$
              , string_agg(quote_ident(attname), ','))
FROM   pg_attribute
WHERE attrelid = 'my_table'::regclass
    AND attname != 'xtra_col'
    AND attnum > 0

you could then copy-and-paste the SQL to run it.


Or, for totally hands-free operation, you could create a function to generate the SQL and execute it:

CREATE OR REPLACE FUNCTION test_func(filepath text, xcol text, fillval text)
RETURNS void
LANGUAGE plpgsql
AS $func$
DECLARE sql text;
BEGIN
  EXECUTE format($$ALTER TABLE my_table ALTER COLUMN %s SET DEFAULT '%s';$$, xcol, fillval);

  SELECT format($$COPY my_table(%s) FROM '%s' WITH (FORMAT CSV, DELIMITER '|', NULL 'NULL', HEADER true);$$
        , string_agg(quote_ident(attname), ','), filepath)
  INTO sql
  FROM   pg_attribute
  WHERE attrelid = 'my_table'::regclass
      AND attname != 'xtra_col'
      AND attnum > 0;
  EXECUTE sql;

  EXECUTE format($$ALTER TABLE my_table ALTER COLUMN %s DROP DEFAULT;$$, xcol);
END;
$func$;

SELECT test_func('/workdir/some_file.txt', 'xtra_col', 'b');

This is the sql I used to test the solution above:

DROP TABLE IF EXISTS test;
CREATE TABLE test (
    xtra_col text
    , col1 int
    , col2 int
    , col3 int
);
INSERT INTO test VALUES
('a', 5, 2, 5)
, ('a', 6, 2, 5)
, ('a', 7, 2, 5);

with the contents of /tmp/data being

col1 | col2 | col3
  0  |   0  |   1
  0  |   1  |   3

Then

SELECT test_func('/tmp/data', 'xtra_col', 'b');
SELECT * FROM test;

results in

+----------+------+------+------+
| xtra_col | col1 | col2 | col3 |
+----------+------+------+------+
| a        |    5 |    2 |    5 |
| a        |    6 |    2 |    5 |
| a        |    7 |    2 |    5 |
| b        |    0 |    0 |    1 |
| b        |    0 |    1 |    3 |
+----------+------+------+------+
(5 rows)

Regarding the pg.dropped column:

The test_func call does not seem to produce the pg.dropped column, at least on the test table used above:

unutbu=# SELECT *
    FROM pg_attribute
    WHERE attrelid = 'test'::regclass;

+----------+----------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+----------+------------+-----------+-------------+--------------+------------+-------------+--------------+--------+------------+---------------+
| attrelid | attname  | atttypid | attstattarget | attlen | attnum | attndims | attcacheoff | atttypmod | attbyval | attstorage | attalign | attnotnull | atthasdef | attidentity | attisdropped | attislocal | attinhcount | attcollation | attacl | attoptions | attfdwoptions |
+----------+----------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+----------+------------+-----------+-------------+--------------+------------+-------------+--------------+--------+------------+---------------+
|    53393 | tableoid |       26 |             0 |      4 |     -7 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | cmax     |       29 |             0 |      4 |     -6 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | xmax     |       28 |             0 |      4 |     -5 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | cmin     |       29 |             0 |      4 |     -4 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | xmin     |       28 |             0 |      4 |     -3 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | ctid     |       27 |             0 |      6 |     -1 |        0 |          -1 |        -1 | f        | p          | s        | t          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | xtra_col |       25 |            -1 |     -1 |      1 |        0 |          -1 |        -1 | f        | x          | i        | f          | f         |             | f            | t          |           0 |          100 |        |            |               |
|    53393 | col1     |       23 |            -1 |      4 |      2 |        0 |          -1 |        -1 | t        | p          | i        | f          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | col2     |       23 |            -1 |      4 |      3 |        0 |          -1 |        -1 | t        | p          | i        | f          | f         |             | f            | t          |           0 |            0 |        |            |               |
|    53393 | col3     |       23 |            -1 |      4 |      4 |        0 |          -1 |        -1 | t        | p          | i        | f          | f         |             | f            | t          |           0 |            0 |        |            |               |
+----------+----------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+----------+------------+-----------+-------------+--------------+------------+-------------+--------------+--------+------------+---------------+
(10 rows)

As far as I know, the pg.dropped column is a natural result of how PostgreSQL works when a column is dropped. So no fix is necessary. Rows whose attname contains pg.dropped also have a negative attnum.

This is why attnum > 0 was used in test_func -- to remove such rows from the generated list of column names.

My experience with Postgresql is limited, so I might be wrong. If you can produce an example which generates a pg.dropped "column" with positive attnum, I'd very much like to see it.

unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • that look interesting, i'll have to try it out. One small thing, is there a way to not repeat columns in my_table? the real my_table has 20 columns and i wouldnt want to call all of them – callmeGuy Mar 21 '19 at 14:11
  • there is a problem with your function because it replaces 'xtra_col' with '...pg.dropped27...'. It looks like it is not replacing the column with nothing, but replacing it with the number. Any ideas how I can fix this? – callmeGuy Mar 25 '19 at 13:58
  • AFAIK, `pg.dropped` "columns" occur when a column is dropped from the table and are a [natural part of how PostgreSQL operates](https://stackoverflow.com/a/43050467/190597), so they shouldn't cause any problem. I believe `pg_attribute.attnum` should always be negative when the `pg_attribute.attname` contains `pg.dropped`, so those rows can handled by using the condition `attnum > 0`. Moreover, since `test_func` does not drop any columns, I don't think it is generating the `pg.dropped` column. – unutbu Mar 25 '19 at 14:30
  • I added code showing how I tested `test_func` and the results I am seeing -- no `pg.dropped` column. If you can show a setup where `test_func` generates a `pg.dropped` column I'd very much like to see it. – unutbu Mar 25 '19 at 14:30
  • ya, i don't know what the problem is, anyways, I've addedAND NOT attname LIKE '%.....% and seems to solve it. But the other error I am bumping into now is the could not open file "/some_filepath", though file exists. Do you happen to know what might be causing this? – callmeGuy Mar 25 '19 at 15:27
  • Sorry, I don't know. You might want to look at [this (for Windows)](https://dba.stackexchange.com/q/29767/2037) or [this (for Linux/Mac)](https://stackoverflow.com/q/19463074/190597). – unutbu Mar 25 '19 at 17:12
  • the problem with the filepath not found issue is that /copy shoul be used over COPY. But now I have another problem since I am using '\', which is a syntax error – callmeGuy Mar 26 '19 at 09:16
  • https://dba.stackexchange.com/questions/189158/postgresql-psql-client-side-commands-and-new-lines-copy see the backclash comment – callmeGuy Mar 26 '19 at 09:16