
I want to get files from Azure Blob Storage and use the Matillion ETL tool to write them to a Snowflake table. In Matillion, the default cloud platform comes up as AWS, and very little documentation is available for Azure. My Matillion account is a Partner Connect account through Snowflake.

I tried creating a new project and selected Azure in it. Then in Azure I created a new app registration and took the client ID and key from there. However, when I enter all this information in Matillion it shows "Blob Storage: Check Credentials".

I am attaching the snapshot for reference.

[Screenshot: Matillion credentials dialog showing "Blob Storage: Check Credentials"]

Please tell me how to connect Azure to Matillion.

Coder1990
  • Review if the steps listed here are done correctly: https://documentation.matillion.com/docs/2312291 – Srinath Menon Oct 28 '21 at 10:30
  • Would suggest checking whether you have granted the RBAC role in your storage account to the service principal that you wish Matillion ETL to use: https://documentation.matillion.com/docs/2954157 – RahulKumarShaw Oct 29 '21 at 12:27
  • @RahulKumarShaw-MT Thanks. That link really helped me. – Coder1990 Nov 02 '21 at 08:06

1 Answer


I believe with Matillion ETL on Snowflake Partner Connect, you always get an AWS-hosted instance. So you won't find an Azure Blob Storage Load component like you would if you launched an Azure-hosted instance of Matillion ETL directly via Matillion themselves.

Instead you will need to follow two steps:

  1. Use a Data Transfer component to copy the file(s) from Azure Blob Storage into an AWS S3 bucket
  2. Use an S3 Load component to COPY the data from S3 into Snowflake
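
For reference, step 1 is conceptually just a download-and-reupload. A minimal Python sketch, assuming the azure-storage-blob and boto3 packages (every name below, from storage account to bucket to file path, is a hypothetical placeholder; step 2 is sketched further down):

    import boto3
    from azure.storage.blob import BlobClient

    # Step 1 equivalent: download the source blob from Azure Blob Storage.
    # All values below are hypothetical placeholders.
    blob = BlobClient(
        account_url="https://<storage-account>.blob.core.windows.net",
        container_name="<container>",
        blob_name="input/data.csv",
        credential="<account-key-or-sas-token>",
    )
    data = blob.download_blob().readall()

    # ...then upload the same bytes to the S3 bucket Snowflake will load from.
    s3 = boto3.client("s3")
    s3.put_object(Bucket="<bucket>", Key="input/data.csv", Body=data)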

The Data Transfer component has to authenticate into Azure somehow, so there is one prerequisite (which it looks like you have already started).

Go to the Project / Manage Credentials dialog, and create a new Azure User-Defined Credential ...

[Screenshot: Manage Credentials dialog]

... setting

  • Tenant ID: You can find this in Azure Active Directory, under Basic Information
  • Client ID: From your App Registration
  • Secret Key: From "Certificates & secrets" inside your App Registration
  • The Encryption Type refers to the way Matillion ETL stores the password. You can use Encoded, or KMS if you have a Master Key.

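If you want to sanity-check those three values outside Matillion, here is a minimal Python sketch assuming the azure-identity and azure-storage-blob packages (all values are hypothetical placeholders):

    from azure.identity import ClientSecretCredential
    from azure.storage.blob import BlobServiceClient

    # Hypothetical placeholders; substitute the three values gathered above.
    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",          # Azure Active Directory > Basic Information
        client_id="<client-id>",          # App Registration overview page
        client_secret="<client-secret>",  # Certificates & secrets
    )

    service = BlobServiceClient(
        account_url="https://<storage-account>.blob.core.windows.net",
        credential=credential,
    )

    # Listing containers only succeeds once the service principal has an
    # RBAC role (e.g. Contributor) on the storage account, as described below.
    for container in service.list_containers():
        print(container.name)
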
You must make sure that you get a "Blob Storage: success" message when pressing Test. It looks like that's the step you got stuck at.

The test will only pass if you have granted Contributor access to at least one Storage Account. You can do this in the Azure portal under Storage accounts / your storage account / Access Control (IAM) / Grant access to this resource / Add Role Assignment.

Grant Contributor access to the App Registration identified by the above credentials.

After setting up the new Azure User-Defined Credential, in the Matillion ETL UI you then need to go to your Environment (bottom left) and set the Azure Credentials to your new Credentials, like this.

[Screenshot: Environment settings with the new Azure Credentials selected]

Once that is done, create a Data Transfer component:

  • set the Source Type to Azure Blob Storage and press the browse button on the Blob Location property. It should list all the Storage Accounts to which the App Registration has been granted access (there is only one in the screenshot below)
  • find the blob you wish to load
  • set the Target Type to S3, and choose a Target Object Name and a Target URL

[Screenshot: Data Transfer component properties]
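
Browsing the Blob Location is effectively the same as listing blobs through the SDK. A minimal sketch, reusing the same hypothetical service-principal values as in the earlier sketch (the container name is also a placeholder):

    from azure.identity import ClientSecretCredential
    from azure.storage.blob import BlobServiceClient

    # Same hypothetical credential values as in the earlier sketch.
    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<client-id>",
        client_secret="<client-secret>",
    )
    service = BlobServiceClient(
        account_url="https://<storage-account>.blob.core.windows.net",
        credential=credential,
    )

    # Equivalent of pressing browse: list what the App Registration can see.
    for blob in service.get_container_client("<container>").list_blobs():
        print(blob.name)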

Run the Data Transfer component to copy the file from Azure storage into S3 storage. After that you will be able to use an S3 Load component to copy the data from S3 into Snowflake.
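
For context, an S3 Load ultimately issues a Snowflake COPY statement. A hedged sketch of the equivalent, assuming the snowflake-connector-python package and AWS key-based credentials (every identifier below is a hypothetical placeholder; a Storage Integration could replace the CREDENTIALS clause):

    import snowflake.connector

    # Hypothetical connection parameters.
    conn = snowflake.connector.connect(
        account="<account>",
        user="<user>",
        password="<password>",
        warehouse="<warehouse>",
        database="<database>",
        schema="<schema>",
    )

    # Roughly the statement an S3 Load component runs against Snowflake.
    conn.cursor().execute("""
        COPY INTO my_table
        FROM 's3://<bucket>/input/'
        CREDENTIALS = (AWS_KEY_ID = '<aws-key-id>' AWS_SECRET_KEY = '<aws-secret-key>')
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)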

53epo
  • Thank you. I hadn't granted Contributor access to the app registration, which was causing the issue. Can you tell me if there is a way to do the data transfer without S3? – Coder1990 Nov 02 '21 at 08:07
  • Glad you got it fixed, it sounded like you were not far off. The Data Transfer component does support a wide range of sources and targets other than S3. But while you are using Matillion ETL via Snowflake Partner Connect, it's always AWS-hosted. So for bulk loading into Snowflake it will be easiest via an S3 Load because that's what the component does when it's AWS-hosted. You could always look at Storage Integrations on the Snowflake side maybe? – 53epo Nov 02 '21 at 19:41
  • Okay, I'll look into that. Thanks for your help. – Coder1990 Nov 03 '21 at 12:38