
I need to read a JSON file from a blob container in Azure so that I can apply some transformations to the JSON files. I have looked at some documentation and Stack Overflow answers and written Python code that reads the files from the blob container.

I tried the script below, taken from one of the Stack Overflow answers, to read the JSON file, but I get the following error:

"TypeError: the JSON object must be str, bytes or byte array, not BytesIO"

I am new to Python programming, so I am not sure what the issue in the code is. I also tried download_stream.content_as_text(), but the file is still not read, even though no error is raised.

from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
from io import BytesIO
import requests
from pandas import json_normalize
import json

filename = "sample.json"

container_name="test"
constr = ""

blob_service_client = BlobServiceClient.from_connection_string(constr)
container_client=blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)
streamdownloader=blob_client.download_blob()

stream = BytesIO()
streamdownloader.download_to_stream(stream)
# with open(stream) as j:
#      contents = json.loads(j)
fileReader = json.loads(stream)

print(filereader)
dfbeg

1 Answer


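You can use the readall() function. It returns the blob's content as bytes, which json.loads accepts directly, whereas a BytesIO object is file-like and is not one of the types json.loads takes. Please try this code: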

from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
import json

filename = "sample.json"

container_name="test"
constr = ""  # storage account connection string

# Build the service, container and blob clients in turn.
blob_service_client = BlobServiceClient.from_connection_string(constr)
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)
streamdownloader = blob_client.download_blob()

# readall() returns the whole blob as bytes, which json.loads can parse.
fileReader = json.loads(streamdownloader.readall())
print(fileReader)
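
If you would rather keep the BytesIO approach from the question, the buffer has to be turned into bytes or read as a file before parsing. The following is only a minimal sketch that runs without an Azure connection: the BytesIO is filled by hand as a stand-in for download_to_stream, and the sample payload is made up for illustration.

```python
import json
from io import BytesIO

# Stand-in for streamdownloader.download_to_stream(stream): the payload is
# a made-up sample, written by hand so the sketch runs without Azure.
stream = BytesIO()
stream.write(b'{"name": "sample", "values": [1, 2, 3]}')

# Option 1: hand json.loads the raw bytes held by the buffer.
from_bytes = json.loads(stream.getvalue())

# Option 2: rewind the buffer and let json.load read it as a file object.
stream.seek(0)
from_file = json.load(stream)

print(from_bytes == from_file)
```

Note the difference between json.loads, which parses a str or bytes value, and json.load, which reads from a file-like object. The seek(0) matters because writing to the buffer leaves its position at the end.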


Steve Johnson
  • Is there any way to read multiple files based on a pattern using the above code? TIA. – SanjanaSanju Jul 13 '22 at 02:07
  • If the JSON is gzipped, use `import gzip` and `json.loads(gzip.decompress(streamdownloader.readall()))` to prevent `UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte`. ([Credits.](https://stackoverflow.com/a/61364994/812102)) – Skippy le Grand Gourou Mar 29 '23 at 13:47
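
The two comments above can be sketched together. This is an illustration under assumptions, not part of the original answer: read_matching_json and parse_json_bytes are hypothetical helper names, the pattern is assumed to be a shell-style wildcard, and container_client is assumed to be the ContainerClient from the answer, whose list_blobs(name_starts_with=...) and download_blob(name) methods are used. Gzipped content is detected by its two leading magic bytes.

```python
import fnmatch
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def parse_json_bytes(data):
    """Parse JSON bytes, decompressing first if the data is gzipped."""
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    return json.loads(data)


def read_matching_json(container_client, pattern, prefix=None):
    """Return {blob name: parsed JSON} for blobs whose name matches pattern.

    container_client is expected to behave like the ContainerClient in the
    answer; pattern is a shell-style wildcard such as "sample*.json".
    Both helper names are hypothetical, chosen for this sketch.
    """
    results = {}
    # name_starts_with narrows the listing on the service side;
    # fnmatchcase then applies the wildcard pattern locally.
    for blob in container_client.list_blobs(name_starts_with=prefix):
        if fnmatch.fnmatchcase(blob.name, pattern):
            downloader = container_client.download_blob(blob.name)
            results[blob.name] = parse_json_bytes(downloader.readall())
    return results
```

With the clients from the answer, a call might look like read_matching_json(container_client, "sample*.json", prefix="sample"). Be aware that the `*` wildcard in fnmatch also matches "/" characters, so a pattern can span virtual folders in blob names.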