1

I am trying to unload snowflake data to S3, I have storage integration setup for the same. I could unload using SQL query, but wanted to do that using snowpark python.

DataFrameWriter.copy_into_location - this snowpark method does not have any parameter for storage_integration, which leaves me clue less on how to get this unload job done with snowpark!

Any help on this would be highly appreciated!

Tried using the existing copy_into_location method, with storage_integration='SI_NAME', which the internal SQL query thrown an error -

Invalid value ''SI_NAME'' for property 'STORAGE_INTEGRATION'. String literal identifier is unsupported for this property. Please use an unquoted or double-quoted identifier.
NikhilKV
  • 48
  • 6

1 Answers1

2

You are right, DataFrameWriter.copy_into_location does not have the storage integration parameter.

You can create an external stage object pointing to your S3 location using your storage integration.

  create stage my_stage_s3
  storage_integration = my_storage_int
  url = 's3://mybucket/encrypted_files/'
  file_format = my_format;

Then, in your copy_into_location call, you specify the location as "@my_stage_s3/"

aek
  • 1,370
  • 2
  • 10
  • 14
  • It worked, thanks! One more quick question, I have provided the parameter "FILE_FORMAT=(TYPE=PARQUET)" during stage creation, also "file_format_type="parquet"" in my copy_into_location method call, but the file unloaded to S3 doesn't have any extension, just the name 'data'. Any thoughts on why there is still no extension. It was good(with extension .snappy.parquet) when I run with SQL "copy into" – NikhilKV Nov 30 '22 at 12:47
  • Strange, I could not reproduce the issue. My command as follows: `copy_result = df.write.copy_into_location("@my_stage_s3/data", file_format_type="parquet", header=True, overwrite=True)` The result: `data_0_0_0.snappy.parquet` – aek Nov 30 '22 at 13:45
  • Little weird. Unless I provide extension by df.write.copy_into_location("@my_stage_s3/data_test.snappy.parquet", file_format_type="parquet", header=True, overwrite=True, single=True, MAX_FILE_SIZE=200 * 1024 * 1024), it creates a file w/o any extension. Anyways thanks for quick responses. – NikhilKV Dec 01 '22 at 05:34