1

I am trying to copy from azure data lake gen2 to table in azure synapse warehouse using local ssms. The COPY INTO statement is neither throwing any errors and nor dumping the data. I am copying the pandas df from centos server to azure data lake gen2 using sep=',', encoding='utf-8'. Here is the COPY statement that I am using.

COPY INTO dbo.SALES_CUTOMER_D 
FROM 'https://acoount_name/test-file-system/SALES_CUSTOMER_D_0.csv'
WITH (
 FILE_TYPE = 'csv',
 CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET=''),
 FIELDQUOTE = '"',
 FIELDTERMINATOR=',',
 ROWTERMINATOR='\r\n',
 ENCODING = 'UTF8',
 FIRSTROW = 2
)
Kashyap
  • 15,354
  • 13
  • 64
  • 103
user8112786
  • 38
  • 1
  • 7
  • Try ERRORFILE option to see if it prints any errors to ERRORFILE. Remove the `FIELDQUOTE`, `FIELDTERMINATOR`, `ROWTERMINATOR`, you're using default values. It could have an effect. Beyond that you'll have to put the contents of the csv file and DDL for `SALES_CUTOMER_D` if you want someone to help you more. Also a good idea to specify the list of columns (`COPY INTO dbo.SALES_CUTOMER_D (col1, col2, ...) FROM ...`). – Kashyap Jul 10 '20 at 19:14
  • @Kashyap yes i tried it for sample csv called employee.csv wirh 10 rows with schema id int and name varchar 100 even for taht also same thing is happening – user8112786 Jul 13 '20 at 10:01
  • also when am trying to write errors to error file again it is throwing the error that access is denied error code 5 – user8112786 Jul 13 '20 at 10:03
  • For ERROR FILE, provide full URL and for ERROR FILE CREDENTIAL use sas token not account key. Also run unix2dos on this input CSV file and try. – Kashyap Jul 13 '20 at 12:49

1 Answers1

0

Check if your file has Unix-style line endings (LF) instead of Windows-style (CRLF).

See Difference between CR LF, LF and CR line break types? if you're not clear on CRLF.

Easiest way I know of checking is to open file in vi in binary mode with set list:

vi -b -c 'set list' <file>

To verify if this is the problem or not, you can do one of the following:

  1. Tell COPY what line endings are in your file:

     COPY INTO dbo.SALES_CUTOMER_D 
     FROM 'https://acoount_name/test-file-system/SALES_CUSTOMER_D_0.csv'
     WITH (
        FILE_TYPE = 'csv',
        CREDENTIAL=(IDENTITY= 'Storage Account Key', SECRET=''),
        ROWTERMINATOR='0x0A',
        FIRSTROW = 2
     )
    
  2. Confirm that it's actually reading the file by making it parse the header. Remove FIRSTROW = 2.

  3. Change line-endings:

  • unix2dos <csv file>
  • upload to datalake and try COPY again, without ROWTERMINATOR='\r\n', (that's the default value).

A little gotcha:

COPY treats ‘\n’ as ‘\r\n’ internally. For more information, see the ROWTERMINATOR section.

In other words:

  • If we don’t specify ROWTERMINATOR option or specify ROWTERMINATOR=’\n’ or ROWTERMINATOR=’0x0D0A’, then the engine uses \r\n as terminator (Windows style).
  • If we specify ROWTERMINATOR=’0x0A’ then engine uses ‘\n’ as the terminator (Unix style)
Kashyap
  • 15,354
  • 13
  • 64
  • 103