0

My issue is that I can't upload a BytesIO file directly to the blob. I have to save it to my hard drive first, reopen it, and then upload it. The file comes from a data server via the requests library.

Using the same code block, I am able to download the file to my hard drive and then upload it to the blob using a with open(...) block. However, when I try to write it directly into the blob from the BytesIO file, the blob is created with no data in it.

I think I am missing a method call to make this happen. The first block of code is a direct upload, which creates an empty file.

from io import BytesIO
from azure.storage.blob import BlobServiceClient

v_file = BytesIO()
for chunk in r.iter_content(chunk_size=1024 * 1024):
    if chunk:
        v_file.write(chunk)

container_name = 'test'
blob_name = 'test_blob'
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
blob_client.upload_blob(data=v_file, overwrite=True)

This block of code works, but I want to run this without downloading the file!

download_folder = 'C:/downloaders'
fqfilename = download_folder + "\\" + filename
with BytesIO() as v_file:
    for chunk in r.iter_content(chunk_size=1024 * 1024):
        if chunk:
            counter += len(chunk)
            size = counter / 1024  # running total in KiB
            v_file.write(chunk)
    # write the complete buffer to disk once, after the last chunk
    with open(fqfilename, 'wb') as f:
        f.write(v_file.getbuffer())

with open(fqfilename, 'rb') as file_to_blob:
    container_name = 'test'
    blob_name = 'test_blob'
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
    # upload the reopened file, not the already-closed BytesIO
    blob_client.upload_blob(data=file_to_blob, overwrite=True)
    print(f' {fqfilename} uploaded to the blob for date {date}')
  • probably answered here: https://stackoverflow.com/questions/26879981/writing-then-reading-in-memory-bytes-bytesio-gives-a-blank-result – Matthias Huschle Feb 07 '23 at 11:57

1 Answer

0

So I managed to find the answer; it's more or less in the post linked above. Call the getvalue() method on the BytesIO file.

from io import BytesIO
from azure.storage.blob import BlobServiceClient

v_file = BytesIO()
for chunk in r.iter_content(chunk_size=1024 * 1024):
    if chunk:
        v_file.write(chunk)

# getvalue() returns the whole buffer as bytes, regardless of the stream position
file_to_blob = v_file.getvalue()

container_name = 'test'
blob_name = 'test_blob'
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
# upload once, after the whole download has been buffered
blob_client.upload_blob(data=file_to_blob, overwrite=True)
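
For anyone wondering why the direct upload came out empty: as far as I can tell, it is the stream position. After the write loop the BytesIO cursor sits at the end of the buffer, so anything that reads from the current position gets nothing. Here is a small sketch of just that behaviour, without Azure involved. The fill_buffer helper and the sample chunks are made up for illustration; they stand in for the r.iter_content loop.

```python
from io import BytesIO


def fill_buffer(chunks):
    """Hypothetical stand-in for the download loop: write each non-empty chunk."""
    buf = BytesIO()
    for chunk in chunks:
        if chunk:
            buf.write(chunk)
    return buf


buf = fill_buffer([b"abc", b"", b"def"])

# After writing, the cursor is at the end of the buffer.
position_after_write = buf.tell()

# Reading from the current position therefore yields no bytes.
read_without_rewind = buf.read()

# getvalue() ignores the cursor and returns the whole buffer.
all_bytes = buf.getvalue()

# Rewinding with seek(0) makes a normal read() see everything again.
buf.seek(0)
read_after_rewind = buf.read()
```

So an alternative to getvalue() should be to call v_file.seek(0) right before upload_blob and pass the stream itself. I have not measured it, but that would avoid making a second copy of the data in memory, which may matter for large files.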
buddemat