2

I'm trying to push (with COPY) a big file from s3 to Redshift. Im using sqlalchemy in python to execute the sql command but it looks that the copy works only if I preliminary TRUNCATE the table.

the connection works ok:

from sqlalchemy import create_engine 
engine = create_engine('postgresql://XXXX:XXXX@XXXX:XXXX/XXXX') 

with this command string (if I truncate the table before the COPY command)

toRedshift = "TRUNCATE TABLE public.my_table; COPY public.my_table from 's3://XXXX/part-p.csv' CREDENTIALS 'aws_access_key_id=AAAAAAA;aws_secret_access_key=BBBBBBB' gzip removequotes IGNOREHEADER 0 delimiter '|';"
engine.execute(toRedshift)

If I remove the "TRUNCATE TABLE public.my_table;" bit

toRedshift = "COPY public.my_table from 's3://XXXX/part-p.csv' CREDENTIALS 'aws_access_key_id=AAAAAAA;aws_secret_access_key=BBBBBBB' gzip removequotes IGNOREHEADER 0 delimiter '|';"
engine.execute(toRedshift)

But the command works perfectly in with any other SQL client (like DBeaver for example)

user3620915
  • 137
  • 1
  • 9
  • 1
    Pretty sure this comes down to [how autocommit works](http://docs.sqlalchemy.org/en/latest/core/connections.html#understanding-autocommit), but not sure yet how, or if at all. Neither TRUNCATE nor COPY are listed in the AUTOCOMMIT_REGEXP pattern. Try wrapping your `'COPY ...'` statement in a `text(stmt).execution_options(autocommit=True)` construct and pass that to `engine.execute()`. – Ilja Everilä Aug 18 '17 at 11:06
  • engine.execute(text(stmt).execution_options(autocommit=True)) – user3620915 Aug 18 '17 at 12:54

1 Answers1

1

Thank you Ilja. With this command it works:

engine.execute(text(toRedshift).execution_options(autocommit=True))

I don't know why I was able to push the data with the TRUNCATE bit at the front of the string.

Ivan

user3620915
  • 137
  • 1
  • 9
  • For reasons unknown it triggers the autocommit (or so it'd seem), though as I said before at least the current regexp pattern for Postgresql does not have TRUNCATE: https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/dialects/postgresql/base.py#L892 – Ilja Everilä Aug 18 '17 at 13:26
  • I think in most DB TRUNCATE is DDL command, not DML (see https://stackoverflow.com/questions/2578194/what-is-ddl-and-dml ). DDL commands trigger commit inherently. – Dolfa Aug 18 '17 at 13:43
  • 1
    @dolfa SQLA specificly handles "autocommit" on its own, with a regex match against the issued statement (or statements, though SQLA does not officially support multi statement SQL, as [in this very similar post](https://stackoverflow.com/questions/45347565/executing-multiple-statements-with-postgresql-via-sqlalchemy-does-not-persist-ch)). That regex contains ddl as well: create, alter, etc., but not truncate. – Ilja Everilä Aug 18 '17 at 18:32