
The API receives the file, then tries to create a unique blob name. Then I upload in chunks of 4 MB to the blob. Each chunk takes about 8 seconds; is this normal? My upload speed is 110 Mbps. I tried uploading a 50 MB file and it took almost 2 minutes. I don't know whether the azure-storage-blob version is related to this; I'm using azure-storage-blob==12.14.1.
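Back-of-the-envelope, the per-chunk timing alone explains the total (these are just my figures from above, treating 1 MB as 8 Mb for simplicity):

```python
# Rough arithmetic check on the numbers from the question above.
import math

size_mb = 50
link_mbps = 110

# Raw transfer time for 50 MB at 110 Mbps, ignoring all overhead:
ideal_seconds = size_mb * 8 / link_mbps
print(round(ideal_seconds, 1))  # -> 3.6

# What I actually see: ceil(50 / 4) = 13 chunks at ~8 s each:
observed_seconds = math.ceil(size_mb / 4) * 8
print(observed_seconds)  # -> 104, i.e. the "almost 2 minutes"
```

So the link itself should move the file in under 4 seconds; the time is going somewhere else.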

import uuid
from flask import request
from azure.storage.blob import BlobClient, BlobBlock, BlobServiceClient

@catalog_api.route("/catalog", methods=['POST'])
def catalog():
    file = request.files['file']

    url_bucket, file_name, file_type = upload_to_blob(file)
    return {"url": url_bucket, "file_name": file_name, "file_type": file_type}


# connection_string, container_name, max_blob_name_tries, chunk_size and
# generate_blob_name come from my app config / helpers (omitted here).
def upload_to_blob(file):
    file_name = file.filename
    file_type = file.content_type

    blob_client = generate_blob_client(file_name)
    blob_url = upload_chunks(blob_client, file)
    return blob_url, file_name, file_type


def generate_blob_client(file_name: str):
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    container_client = blob_service_client.get_container_client(container_name)

    # Retry a few times in case the generated name already exists.
    for _ in range(max_blob_name_tries):
        blob_name = generate_blob_name(file_name)

        blob_client = container_client.get_blob_client(blob_name)
        if not blob_client.exists():
            return blob_client
    raise Exception("Couldn't create the blob")


def upload_chunks(blob_client: BlobClient, file):
    block_list = []
    while True:
        read_data = file.read(chunk_size)

        if not read_data:
            print("uploaded")
            break

        print("uploading")
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

    blob_client.commit_block_list(block_list)

    return blob_client.url
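To see whether the time really goes into each `stage_block` call, each chunk can be timed individually. A minimal sketch (the `iter_chunks` helper and the in-memory stand-in file are illustrative, not part of my handler):

```python
import io
import time

def iter_chunks(stream, chunk_size=4 * 1024 * 1024):
    """Yield successive chunk_size blocks from a file-like object."""
    while True:
        data = stream.read(chunk_size)
        if not data:
            return
        yield data

# Stand-in for the uploaded file: 10 MB of zeros.
fake_upload = io.BytesIO(b"\0" * (10 * 1024 * 1024))

for i, chunk in enumerate(iter_chunks(fake_upload)):
    start = time.perf_counter()
    # In the real handler, blob_client.stage_block(...) would go here.
    elapsed = time.perf_counter() - start
    print(f"chunk {i}: {len(chunk)} bytes in {elapsed:.3f}s")
```

If the 8 seconds shows up around `stage_block` itself, the bottleneck is the network round-trip per block rather than reading the file.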
davidism

1 Answer


I tried this in my environment and got the results below:

Uploading a 50 MB file from my local environment to a blob storage account, with a chunk size of 4*1024*1024, took about 45 seconds.

Code:

import uuid
import time
from azure.storage.blob import BlobBlock, BlobServiceClient

connection_string = "<storage account connection string>"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client('test')
blob_client = container_client.get_blob_client("file.pdf")

start = time.time()

# upload data
block_list = []
chunk_size = 4 * 1024 * 1024
with open("C:\\file.pdf", 'rb') as f:
    while True:
        read_data = f.read(chunk_size)
        if not read_data:
            break  # done
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

blob_client.commit_block_list(block_list)

end = time.time()
print("Time taken to upload blob:", end - start, "secs")

In the above code, I record the time before and after the upload and print end - start to measure how long the upload of the file to blob storage took.

Console:

(screenshot: console output showing the upload time)

Make sure your internet connection is fast; I also tried with a slower connection, and the same upload took at most 78 seconds.
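Alternatively, the SDK can handle the chunking and parallelism itself: `BlobClient.upload_blob` accepts a `max_concurrency` parameter, so blocks are staged in parallel instead of one at a time. A sketch (same placeholder connection string and file path as above; this needs a real storage account to run):

```python
from azure.storage.blob import BlobServiceClient

connection_string = "<storage account connection string>"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
blob_client = blob_service_client.get_container_client('test').get_blob_client("file.pdf")

# upload_blob splits the stream into blocks internally and stages up to
# max_concurrency of them at the same time.
with open("C:\\file.pdf", 'rb') as f:
    blob_client.upload_blob(f, overwrite=True, max_concurrency=4)
```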

Portal:

(screenshot: the uploaded blob shown in the Azure portal)

Venkatesan