
I have a Python script where I'm downloading a file into memory and storing it in a variable:

stream = downloadBlob(blob_name_file)

from io import BytesIO

from azure.storage.blob import ContainerClient

def downloadBlob(blob_name_file: str):
    # container client to access the container (container URL is stored as a secret;
    # secret_client is an existing Key Vault client created elsewhere in the script)
    container_str_url = secret_client.get_secret("testString").value
    container_client = ContainerClient.from_container_url(container_str_url)
    # blob client to access the specific blob
    blob_client = container_client.get_blob_client(blob=blob_name_file)
    # download the blob into an in-memory bytes buffer
    stream_downloader = blob_client.download_blob()
    stream = BytesIO()
    stream_downloader.readinto(stream)
    stream.seek(0)  # rewind so the buffer can be read from the start
    return stream
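
For reference, a quick way to sanity-check how much a single download actually holds in memory would be something like this (just a diagnostic, not part of the real flow):

    stream = downloadBlob(blob_name_file)
    # rough size of the in-memory blob, in MB
    print(f"blob buffer size: {stream.getbuffer().nbytes / 1024 / 1024:.1f} MB")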

After I have that, I put it into a DataFrame:

import pandas as pd

try:
    processed_df = pd.read_parquet(stream, engine='pyarrow')
except Exception as e:
    print(e)
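
If only a subset of the columns is needed downstream, pd.read_parquet can also be told to load just those, which keeps the resulting DataFrame smaller; a rough sketch (the column names are placeholders), plus a quick way to see what the frame itself holds:

    # load only the columns actually used later; names here are placeholders
    processed_df = pd.read_parquet(stream, engine='pyarrow', columns=['col_a', 'col_b'])
    # approximate size of the DataFrame in memory, in MB
    print(processed_df.memory_usage(deep=True).sum() / 1024 / 1024)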

Once it's in the DataFrame, I go ahead and delete the stream variable to free up memory (or at least that's the goal):

del stream

The problem I'm facing is that I'm still using too much memory. I'm running the script on Azure and am limited to ~2.5 GB of RAM. It runs fine on one machine without memory issues, but when I push it up to run on even just two machines with one instance each, I hit the memory cap at times. The code I've outlined in this post is what I assume uses the most memory; the rest of the script basically just passes the DataFrame around. I even reduce the DataFrame's size and del the original.
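
One variant I'm considering is doing the download and the parquet read inside a single helper, so the BytesIO buffer is only referenced locally and can be freed as soon as the function returns; a rough sketch reusing the same clients as above (the function name is just a placeholder):

    def downloadBlobToDataframe(blob_name_file: str) -> pd.DataFrame:
        container_str_url = secret_client.get_secret("testString").value
        container_client = ContainerClient.from_container_url(container_str_url)
        blob_client = container_client.get_blob_client(blob=blob_name_file)
        # the raw bytes live only inside this function, so the buffer can be
        # collected as soon as the DataFrame has been built and returned
        stream = BytesIO()
        blob_client.download_blob().readinto(stream)
        stream.seek(0)
        return pd.read_parquet(stream, engine='pyarrow')

    processed_df = downloadBlobToDataframe(blob_name_file)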

My question is: is there a better way of doing what I'm doing? And is del actually doing what I think it's doing? Because, to be honest, even after adding del to my script, it didn't seem to affect memory usage much, if at all.
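
In case it's useful, this is roughly the check I'm planning to use to see whether del (and gc.collect(), which I haven't tried yet) actually frees anything; psutil is only there for the measurement:

    import gc
    import psutil

    def rss_mb() -> float:
        # resident memory of this process, in MB
        return psutil.Process().memory_info().rss / 1024 / 1024

    print(f"before del: {rss_mb():.1f} MB")
    del stream
    gc.collect()
    print(f"after del + gc.collect(): {rss_mb():.1f} MB")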

BlakeB9
    Does this answer your question? [Free memory in Python](https://stackoverflow.com/questions/47959077/free-memory-in-python) – Donnie May 16 '22 at 22:31
  • @Donnie There is definitely something there that I can try: `gc.collect()` after the `del`s. I will look into it tomorrow, thanks for the suggestion – BlakeB9 May 16 '22 at 22:37
