
I'm trying to use a Custom activity in Data Factory to run, in a Batch account pool, a Python script stored in Blob Storage.

I followed the Microsoft tutorial https://learn.microsoft.com/en-us/azure/batch/tutorial-run-python-batch-azure-data-factory

My problem is that when I execute the ADF pipeline, the activity fails.

When I check in the Batch Explorer tool, I get a BlobAccessDenied message.

Depending on the execution, it happens on all the ADF reference files, but sometimes also on my batch file.

I have linked the Storage Account to the Batch Account.

I'm new to this and I'm not sure what I need to do to solve it.

Thank you in advance for your help.

Yohann V.
  • Hi @Yohann, did you paste your storage account connection string at line number 6 in main.py file? Also, you need to create Linked Service for your Storage and Batch accounts in ADF. These linked services are required when you configure your pipeline. – Utkarsh Pal Jun 29 '21 at 07:07

1 Answer


I tried to reproduce the issue and it works fine for me. Please check the following points while creating the pipeline:

  1. Check that you have pasted your storage account connection string at line number 6 in the main.py file.
  2. You need to create a Blob Storage linked service and an Azure Batch linked service in Azure Data Factory (ADF). These linked services are required in the “Azure Batch” and “Settings” tabs when you configure the ADF pipeline. The steps below walk through creating them.
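For reference, the tutorial's main.py essentially reads iris.csv from the linked Storage account (using the connection string pasted at line 6), keeps only the setosa rows, and writes iris_setosa.csv back. Here is a minimal sketch of that filtering step using only the standard library — the real script uses pandas and azure-storage-blob, and the `species` column name here is an assumption, so adapt it to your actual CSV:

```python
import csv
import io

# Sketch of the filtering step from the tutorial's main.py.
# The real script downloads iris.csv with azure-storage-blob (using the
# connection string at line 6) and uploads the result; here the blob I/O
# is replaced by in-memory text so the core logic is visible.

def filter_setosa(iris_csv_text):
    """Keep only the rows whose species column is 'setosa'."""
    reader = csv.DictReader(io.StringIO(iris_csv_text))
    rows = [row for row in reader if row["species"] == "setosa"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

sample = (
    "sepal_length,species\n"
    "5.1,setosa\n"
    "7.0,versicolor\n"
    "4.9,setosa\n"
)
print(filter_setosa(sample))
```

If this step fails on the pool node rather than the filtering itself, the connection string (or the resource-file download, as in your BlobAccessDenied error) is usually the culprit.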

In the ADF portal, click the ‘Manage’ icon on the left, then click +New to create the Blob Storage linked service.


Search for “Azure Blob Storage” and then click Continue.


Fill in the required details for your Storage account, test the connection, and then click Apply.


Similarly, search for the Azure Batch linked service (under the Compute tab).


Fill in the details of your Batch account, select the previously created Storage linked service under “Storage linked service name”, and then test the connection. Click Save.
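Behind that form, the Batch linked service references the Storage linked service by name. A rough sketch of the definition ADF generates, written as a Python dict (property names follow the ADF AzureBatch linked-service schema; the account, region, pool, and linked-service names are placeholders for your own resources, and the access key is omitted):

```python
# Illustrative sketch only: what the "New linked service (Azure Batch)"
# form produces. All values below are placeholders; in a real definition
# accessKey would be a SecureString or an Azure Key Vault reference.
batch_linked_service = {
    "name": "AzureBatch1",
    "properties": {
        "type": "AzureBatch",
        "typeProperties": {
            "accountName": "mybatchaccount",
            "batchUri": "https://mybatchaccount.westeurope.batch.azure.com",
            "poolName": "mypool",
            # The "Storage linked service name" field from the form:
            "linkedServiceName": {
                "referenceName": "AzureBlobStorage1",
                "type": "LinkedServiceReference",
            },
        },
    },
}
print(batch_linked_service["properties"]["type"])
```

The point to notice is the nested `linkedServiceName`: the Batch linked service only works if the Storage linked service it points to can itself access the blobs.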


Later, when you create the custom ADF pipeline, provide the Batch linked service name under the “Azure Batch” tab.


Under the “Settings” tab, provide the Storage linked service name and the other required information. In “Folder Path”, provide the blob folder that contains your main.py and iris.csv files.
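For orientation, the two tabs together produce a Custom activity definition roughly like the following, sketched as a Python dict (property names follow the ADF Custom activity schema; the linked-service names and the `input` folder are placeholders for your own setup):

```python
# Illustrative sketch only: the activity ADF generates from the
# "Azure Batch" and "Settings" tabs of a Custom activity.
custom_activity = {
    "name": "testCustomActivity",
    "type": "Custom",
    # "Azure Batch" tab -> the Batch linked service created earlier
    "linkedServiceName": {
        "referenceName": "AzureBatch1",
        "type": "LinkedServiceReference",
    },
    "typeProperties": {
        # "Command" field: what runs on the pool node
        "command": "python main.py",
        # "Resource linked service": storage used to fetch the files
        "resourceLinkedService": {
            "referenceName": "AzureBlobStorage1",
            "type": "LinkedServiceReference",
        },
        # "Folder Path": the blob folder holding main.py and iris.csv;
        # Batch downloads its contents into the task's working directory
        "folderPath": "input",
    },
}
print(custom_activity["typeProperties"]["command"])
```

Everything under `folderPath` is copied to the Batch node as resource files before the command runs, which is why a BlobAccessDenied error at job start points at this storage reference rather than at the script itself.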


Once this is done, you can Validate, Debug, Publish, and Trigger the pipeline. The pipeline should run successfully.


Once the pipeline has run successfully, you will see the iris_setosa.csv file in your output blob.


Utkarsh Pal
  • Hi, thank you for your reply. I have already done all the steps that you described. I think my problem is not the batch itself but the communication between the Storage Account and the Batch Account, because the error says it cannot access the resource files at the start of the job. The Storage Account I have to use has a Private access level. Does this mean that I have to create an identity for the Batch Account in Azure AD so I can add that identity to the Access Control of the Storage Account? – Yohann V. Jun 29 '21 at 09:47
  • Yes, Azure Active Directory (Azure AD) authorizes access rights to secured resources through Azure role-based access control (Azure RBAC). You can refer https://learn.microsoft.com/en-us/azure/storage/common/storage-auth-aad to know more. – Utkarsh Pal Jun 29 '21 at 10:16
  • @YohannV. Did you manage to solve this issue? I'm currently facing something similar. – Repcak Aug 04 '22 at 16:14