I am facing an issue while trying to store the generated transcript file using the speech-to-text transcription batch API in Azure. I am using the destination_container_url parameter to specify the destination container where I want to store the file. However, I am unable to store the file in the predefined directory folder structure within the Azure Container.

I have already tried providing the destination_container_url parameter with the desired directory path, but the API seems to ignore the directory structure and stores the file in the root of the container instead.

Rumit dev
  • You can try passing the entire path to your transcript directory in your storage container within the destination_container_url parameter, as https://.blob.core.windows.net//transcripts. – SiddheshDesai May 30 '23 at 13:05
  • I have already tried passing the desired directory, but it stored the transcribed JSON file inside the automatically created transcription_id folder. @SiddheshDesai – Rumit dev May 30 '23 at 13:10
  • Can you try passing the path to the container in outputContainerUrl and adding your desired output container to the request? Example: POST /transcriptions HTTP/1.1 Host: .cognitiveservices.azure.com Content-Type: application/json Ocp-Apim-Subscription-Key: { "contentUrls": [ "" ], "outputContainerUrl": "", "prefix": "" } – SiddheshDesai May 30 '23 at 13:13
  • I have tried passing the outputContainerUrl parameter to the Speech to text API, but now the result is not created inside my container; it ends up inside the Microsoft default container. @SiddheshDesai – Rumit dev May 30 '23 at 13:28
  • Does the container exist in your specified OutputcontainerUrl parameter? – SiddheshDesai May 30 '23 at 13:34
  • Yes @SiddheshDesai – Rumit dev May 30 '23 at 13:38
  • Refer to this point in the MS documentation: "You can store the results of a batch transcription to a writable Azure Blob storage container using option destinationContainerUrl in the batch transcription creation request. Note however that this option is only using ad hoc SAS URI and doesn't support Trusted Azure services security mechanism. The Storage account resource of the destination container must allow all external traffic." https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription-create?pivots=rest-api#destination-container-url – SiddheshDesai May 30 '23 at 14:28
  • Did you allow all public access to the container? – SiddheshDesai May 30 '23 at 14:29
  • Refer to this storage authorization document: https://learn.microsoft.com/en-us/rest/api/storageservices/authorize-requests-to-azure-storage. You need to authorize access to the desired container either with a SAS URI or by giving your Speech resource the blob contributor role on the storage account, but the SAS URI is the recommended method. Also, just for testing, follow this sample GitHub repo: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch – SiddheshDesai May 30 '23 at 14:35
  • In the sample above, the container URL in # properties.destination_container_url = "" needs to be a SAS URL with at least write permissions. Refer here to create the SAS URL: https://learn.microsoft.com/en-us/azure/cognitive-services/translator/document-translation/how-to-guides/create-sas-tokens?tabs=Containers. Then pass the URL of the desired container with the SAS token appended (https://i.imgur.com/X59FTlf.png); copy this URI: https://i.imgur.com/EGaASFZ.png – SiddheshDesai May 30 '23 at 14:57
  • And use the above Blob sas url in the destination_container_url and try? – SiddheshDesai May 30 '23 at 14:58
  • Yes, I have already tried with the destination container, and it stored the results inside the defined container. The problem is that I want to store the result inside a defined folder structure in the container, which I cannot achieve: the result is stored in the defined container under an automatically created folder named after the transcription ID. @SiddheshDesai – Rumit dev May 31 '23 at 05:06
  • Create a folder in your storage account and pass it in your destination URL with a SAS token, like below: https://storageaccountname.blob.core.windows.net//?sp=rw&st=2023-05-31T06:46:53Z&se=2023-05-31T14:46:53Z&spr=https&sv=2022-11-02&sr=b&sig=xYvk1yve6Kq5FYIaY3OwIff%2FghzSXN%2Ftqu5C9O7irGQ%3D The part starting at sp is the SAS token, generated at the blob level inside a folder – SiddheshDesai May 31 '23 at 06:49
  • Does the above url work? You can pass the SAS token generated at the container level there? – SiddheshDesai May 31 '23 at 12:53
  • I tried using this URL https://storageaccountname.blob.core.windows.net/%3Ccontainer-name%3E/%3Cfolder-name%3E?sp=rw&st=2023-05-31T06:46:53Z&se=2023-05-31T14:46:53Z&spr=https&sv=2022-11-02&sr=b&sig=xYvk1yve6Kq5FYIaY3OwIff%2FghzSXN%2Ftqu5C9O7irGQ%3D with a curl POST request to save the results to a specific container directory, but it is not possible. Refer to this SO thread answer by Gaurav Mantri: https://stackoverflow.com/questions/52420756/provide-access-to-a-folder-in-azure-blob-container. A blob storage folder is a virtual folder, not a real one, so this is not possible. – SiddheshDesai May 31 '23 at 16:01
  • For now, the only option you have is to write the results to the container and then copy them into the specific folder using the azcopy command: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-files – SiddheshDesai May 31 '23 at 16:04
  • Yes, I know that is my last resort. But there is one problem: I need to run the az command from a Python script. – Rumit dev Jun 01 '23 at 05:52
  • But is there any Python SDK support for copying or moving the folder to the desired location? @SiddheshDesai – Rumit dev Jun 01 '23 at 05:53
  • "I need to run the az command using the Python script": you can do it by referring to the answer here: https://stackoverflow.com/questions/73701627/login-to-python-script-using-service-principal. "Is there any Python SDK support that copies and moves the folder to the desired location?": yes, you can use the code in this SO thread answer to copy files from one file or folder to another: https://stackoverflow.com/questions/73941246/azure-storage-account-how-to-rename-move-a-blob-within-a-container# – SiddheshDesai Jun 01 '23 at 06:24

1 Answer

Posting my comments as an answer

I tried the Batch transcription curl request below with my Azure storage destination URL set to:

https://storageaccountname.blob.core.windows.net/<container-name>/<folder-name>?sp=rw&st=2023-05-31T06:46:53Z&se=2023-05-31T14:46:53Z&spr=https&sv=2022-11-02&sr=b&sig=xYvk1yve6Kq5FYIaY3OwIff%2FghzSXN%2Ftqu5C9O7irGQ%3D

But the transcription results were not saved in the specific folder inside the container. According to the answer here by Gaurav Mantri, blob folders/directories are virtual directories, and the Batch Transcription API does not have a property for writing the transcription results to a specific folder inside the container. In the sample Batch Transcription Python code here, the property is set to a container URL, not a container folder URL.

# properties.destination_container_url = "<SAS Uri with at least write (w) permissions for an Azure Storage blob container that results should be written to>"
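To illustrate why a "folder" is only part of the blob name, here is a minimal sketch that splits a blob URL into its pieces using the standard library only. The helper name `split_blob_url` and the container, folder, and SAS values are hypothetical and chosen for illustration.

```python
from urllib.parse import urlsplit, unquote


def split_blob_url(url):
    """Split a blob URL into (host, container, blob_name, sas_query).

    Only the first path segment is a real storage entity (the container).
    Everything after it, slashes included, is simply the blob's name.
    """
    parts = urlsplit(url)
    path = unquote(parts.path).lstrip("/")
    container, _, blob_name = path.partition("/")
    return parts.netloc, container, blob_name, parts.query


# Hypothetical URL: "myfolder/" is a name prefix, not a separate resource
host, container, blob_name, sas = split_blob_url(
    "https://storageaccountname.blob.core.windows.net/mycontainer/myfolder/result.json?sp=rw"
)
print(container)  # mycontainer
print(blob_name)  # myfolder/result.json
```

Because the virtual directory exists only as a prefix of blob names, there is nothing at the folder level for a container-scoped destination URL to point to.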

API request, taken from this document:

curl -v -X POST \
  -H "Ocp-Apim-Subscription-Key: YourSubscriptionKey" \
  -H "Content-Type: application/json" \
  -d '{
    "contentUrls": [
      "https://crbn.us/hello.wav",
      "https://crbn.us/whatstheweatherlike.wav"
    ],
    "locale": "en-US",
    "displayName": "My Transcription",
    "model": null,
    "properties": {
      "wordLevelTimestampsEnabled": true,
      "languageIdentification": {
        "candidateLocales": ["en-US", "de-DE", "es-ES"]
      }
    }
  }' \
  "https://YourServiceRegion.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions"
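For reference, the same request can be built in Python. The sketch below uses only the standard library and adds the destinationContainerUrl property mentioned in the comments; the function name `build_transcription_request`, the region, the key, and the SAS URL are placeholders assumed for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_transcription_request(region, subscription_key, content_urls,
                                destination_container_url=None):
    """Build (but do not send) a batch transcription creation request."""
    properties = {"wordLevelTimestampsEnabled": True}
    if destination_container_url:
        # Must be a container-level SAS URL with write permission;
        # a folder path inside the container is not honored.
        properties["destinationContainerUrl"] = destination_container_url
    body = {
        "contentUrls": list(content_urls),
        "locale": "en-US",
        "displayName": "My Transcription",
        "properties": properties,
    }
    return urllib.request.Request(
        url=f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with your own region, key, and SAS URL
request = build_transcription_request(
    "YourServiceRegion",
    "YourSubscriptionKey",
    ["https://crbn.us/hello.wav"],
    destination_container_url="https://storageaccountname.blob.core.windows.net/mycontainer?sp=rw",
)
# To submit it: urllib.request.urlopen(request)
```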


As an alternative, you can copy or move the transcript result file from your container to a specific folder in the same or another container using the code below:

from azure.storage.blob import BlobServiceClient

# Container names cannot contain "/", so any folder path belongs in the blob name
source_container_name = "siliconcotainer"
source_blob_name = "container/result.json"
destination_container_name = "siliconcontainer2"
destination_blob_name = "folder/result2.json"

connection_string = "DefaultEndpointsProtocol=https;AccountName=storageaccountname;AccountKey=xxxxxxxxcxxxxxAStaktbOA==;EndpointSuffix=core.windows.net"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)

source_blob_client = blob_service_client.get_blob_client(container=source_container_name, blob=source_blob_name)
destination_blob_client = blob_service_client.get_blob_client(container=destination_container_name, blob=destination_blob_name)

# Start a server-side copy; the copy runs asynchronously on the storage service
destination_blob_client.start_copy_from_url(source_blob_client.url)
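Since the question mentions that results land in an automatically created folder named after the transcription ID, the single-blob copy can be extended to move everything under that prefix. This is a rough sketch, not a tested implementation: the helper names `target_blob_name` and `move_prefix` and all argument values are hypothetical, and it assumes the azure-storage-blob package plus a source and target in the same storage account.

```python
import time


def target_blob_name(source_name, source_prefix, target_folder):
    """Map a blob name under source_prefix to the same relative name under target_folder."""
    relative = source_name[len(source_prefix):].lstrip("/")
    return f"{target_folder.strip('/')}/{relative}"


def move_prefix(connection_string, container_name, source_prefix, target_folder):
    """Copy every blob under source_prefix into target_folder, then delete the source."""
    # Imported lazily so target_blob_name works without the SDK installed
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client(container_name)
    for blob in container.list_blobs(name_starts_with=source_prefix):
        source = container.get_blob_client(blob.name)
        target = container.get_blob_client(
            target_blob_name(blob.name, source_prefix, target_folder)
        )
        target.start_copy_from_url(source.url)
        # The copy is asynchronous, so poll until it is no longer pending
        while target.get_blob_properties().copy.status == "pending":
            time.sleep(1)
        # Delete the source only after a confirmed successful copy
        if target.get_blob_properties().copy.status == "success":
            source.delete_blob()


# Hypothetical transcription ID prefix and target folder
print(target_blob_name("abc123/result.json", "abc123", "transcripts/2023"))
```

Blob storage has no rename operation, so a "move" is a copy followed by a delete of the source blob.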

To perform az login from a Python script, first install the package below:

pip install azure-cli

Then invoke the CLI from Python:

from azure.cli.core import get_default_cli

# Get the default Azure CLI instance
cli = get_default_cli()

# Run "az login --use-device-code". The CLI itself prints the device code and
# the sign-in URL to the console; invoke() returns only the exit code.
exit_code = cli.invoke(['login', '--use-device-code'])

# A non-zero exit code means the login failed
print("Exit code:", exit_code)

If you want to log in directly via the browser instead of using a device code, just remove '--use-device-code':

exit_code = cli.invoke(['login'])
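For an unattended script, a service principal login (the topic of the first reference below) avoids the interactive prompt. The following is a sketch under assumptions: the helper names and credential values are hypothetical, the flags mirror `az login --service-principal --username ... --password ... --tenant ...`, and it relies on invoke() returning an exit code.

```python
def build_login_args(client_id, client_secret, tenant_id):
    """Build the argument list for a non-interactive service principal login."""
    return [
        "login", "--service-principal",
        "--username", client_id,
        "--password", client_secret,
        "--tenant", tenant_id,
    ]


def login_with_service_principal(client_id, client_secret, tenant_id):
    """Run the login through the Azure CLI Python entry point."""
    # Imported lazily so build_login_args works without azure-cli installed
    from azure.cli.core import get_default_cli

    exit_code = get_default_cli().invoke(
        build_login_args(client_id, client_secret, tenant_id)
    )
    if exit_code != 0:
        raise RuntimeError(f"az login failed with exit code {exit_code}")


# Hypothetical credentials, shown only to illustrate the argument order
print(build_login_args("app-id", "secret", "tenant-id"))
```

In practice the secret would come from an environment variable or a secret store rather than being hard-coded in the script.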

Reference:-

azure - Login to python script using service principal - Stack Overflow By Jahnavi

python - Azure Storage Account: How to rename/move a Blob within a Container - Stack Overflow By SwethaKandikonda

As BlobService is unsupported, I have used BlobServiceClient in my code above.

SiddheshDesai