I'm trying to upload a file to Vertica using the vsql CLI. My command looks like this (it's all one line, but I'm breaking it up with backslashes here for readability):
vsql "-c set timezone to 'UTC'; \
copy etl.test_data1 (col1,col2,col3) \
from local 'test_data1.csv.gz' \
gzip delimiter E'\\t' \
exceptions '/home/error/etl.test_data1.err' \
NULL AS 'NULL' \
NO ESCAPE \
rejected data '/home/rejected/etl.test_data1.rejected'" \
-p5433 -hverticahostname -Uusername -ddatabase -vON_ERROR_STOP=ON
When I run that command, it executes with no error, but no rows are loaded and the data goes straight into the rejected file.
When I run the equivalent CSV-load command on the un-gzipped version of the same file, it loads the data into the table without rejections:
vsql "-c set timezone to 'UTC'; \
copy etl.test_data1 (col1,col2,col3) \
from local 'test_data1.csv' \
delimiter ',' \
exceptions '/home/error/etl.test_data1.err' \
NULL AS 'NULL' \
NO ESCAPE \
rejected data '/home/rejected/etl.test_data1.rejected'" \
-p5433 -hverticahostname -Uusername -ddatabase -vON_ERROR_STOP=ON
I tried running the command with no explicit delimiter set and got the same result. I also tried creating the gzip with both the UNIX gzip command and Python's gzip module (roughly as shown below), in case the file had somehow been corrupted; same result either way.
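For reference, I created the compressed file along these lines (the exact invocations are approximate):

# UNIX gzip, keeping the original file
gzip -c test_data1.csv > test_data1.csv.gz

# equivalent route through Python's gzip module
python -c "import gzip, shutil; f = open('test_data1.csv', 'rb'); g = gzip.open('test_data1.csv.gz', 'wb'); shutil.copyfileobj(f, g); g.close(); f.close()"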
The file I would eventually like to upload via a job is massive, and uncompressed CSV isn't a performant option for a regular upload. Any suggestions as to why the gzip load is failing?
EDIT: Solution in the comments: gzipping a CSV doesn't change the delimiter, so the load still needs to use ',' as the delimiter, not tab!
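In other words, the working gzip load should look like the original command with only the delimiter changed:

vsql "-c set timezone to 'UTC'; \
copy etl.test_data1 (col1,col2,col3) \
from local 'test_data1.csv.gz' \
gzip delimiter ',' \
exceptions '/home/error/etl.test_data1.err' \
NULL AS 'NULL' \
NO ESCAPE \
rejected data '/home/rejected/etl.test_data1.rejected'" \
-p5433 -hverticahostname -Uusername -ddatabase -vON_ERROR_STOP=ON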