
I have written test code below that reproduces the out-of-memory error I am getting in my program. I construct numpy arrays and create DataFrames out of them, but I store neither the arrays nor the results, and the returned DataFrame is discarded. Instead of releasing that memory, the process keeps accumulating it until the program eventually crashes. The returned DataFrame is not referenced anywhere, so it should be reclaimed after gc.collect(), but it isn't.

Is there a way to force Python to release the memory allocated during numpy array creation?

import numpy as np
import pandas as pd
import time
import gc

def test(size):
    arr = np.random.randint(0, 10, size)
    start = time.perf_counter()
    df = pd.DataFrame(arr)
    print("time elapsed {:6.6f}".format(time.perf_counter() - start))
    return df

# create 1 billion integers per call; the returned DataFrame is never kept
test(1000000000)
gc.collect()
test(1000000000)
gc.collect()
test(1000000000)
gc.collect()
  • Have you already seen this post? https://stackoverflow.com/questions/15455048/releasing-memory-in-python. To explicitly free up memory you can use `import gc; gc.collect()` – Elidor00 Jan 25 '20 at 22:27
  • Also check this post: https://stackoverflow.com/questions/35316728/does-setting-numpy-arrays-to-none-free-memory – Elidor00 Jan 25 '20 at 22:35
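
For reference, a minimal sketch of what the linked posts suggest, namely dropping the reference explicitly before collecting (this reuses the test function defined in the question):

df = test(1000000000)
del df        # drop the only reference to the DataFrame (and, through it, the array)
gc.collect()  # ask the collector to reclaim the now-unreferenced objects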
