I am trying to run a script that extracts some information from a pandas dataframe and, for each image read from a folder, saves a figure containing that image together with the matching subset of the dataframe.
The script runs fine for fewer than 700 images. For larger numbers of images, the script gets killed due to Out of Memory (detected with the dmesg command in a Linux terminal after the Python process exited with the "Killed" message).
The script looks like this (I tried closing the figure and deleting variables at the end of the loop, but it didn't help):
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from matplotlib import gridspec
import pandas as pd
import numpy as np
# READ CSV ~20k lines, not an issue for memory
df = pd.read_csv('20220221-1516_export.csv')
images = os.listdir('./slices') # list of images contained in the folder
gs = gridspec.GridSpec(2, 3, width_ratios=[1, 3, 1], height_ratios=[3.5, 1])
for image_name in images:
    image_df = df[df.filename == image_name][['label', 'grader1', 'grader2', 'grader3', 'grader4', 'grader5']]
    image_df = image_df[~image_df.filter(like='grader').apply(set, axis=1).isin([{False, np.nan}, {False}])].set_index('label')
    fig = plt.figure(figsize=(10, 7))
    ax0 = fig.add_subplot(gs[1])
    img = mpimg.imread(os.path.join('slices', image_name))
    ax0.imshow(img, cmap='gray')
    ax1 = fig.add_subplot(gs[3:6])
    ax1.axis('tight')
    ax1.axis('off')
    colors = image_df.applymap(lambda x: '#9BCA3E' if x == True else ('#ED5314' if x == False else '#C5C7D8'))
    ax1.table(cellText=image_df.values, colLabels=image_df.columns, rowLabels=image_df.index, loc='center', cellColours=colors.values)
    fig.tight_layout()
    fig.savefig(os.path.join('output', image_name))
    # I tried to close figures and delete variables but nothing changes
    plt.close(fig)
    del image_df, fig, ax0, ax1, colors, img
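A variation that might be worth trying, offered only as a sketch and not as a confirmed fix: select the non-interactive Agg backend and reuse a single figure, clearing it with fig.clf() on every iteration instead of creating a new one. The function name render_all, the in-memory buffer and the toy arrays are assumptions made for illustration; the real loop would read from 'slices' and save under 'output'.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, selected before any figure is created
import matplotlib.pyplot as plt
import numpy as np


def render_all(arrays):
    """Render each 2-D array to PNG, reusing one Figure; return the PNG sizes in bytes."""
    fig = plt.figure(figsize=(10, 7))
    sizes = []
    try:
        for arr in arrays:
            fig.clf()  # remove axes and artists left over from the previous iteration
            ax = fig.add_subplot(1, 1, 1)
            ax.imshow(arr, cmap="gray")
            buf = io.BytesIO()  # hypothetical stand-in for a path under 'output'
            fig.savefig(buf, format="png")
            sizes.append(len(buf.getvalue()))
    finally:
        plt.close(fig)  # release the single figure once the loop is done
    return sizes


# Toy input standing in for the images read from disk
demo_arrays = [np.eye(4), np.arange(16).reshape(4, 4)]
print(render_all(demo_arrays))
```

Only one figure exists at any time in this version, so if memory still grows with it, the figure lifecycle is probably not the cause.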
The CSV file consists of roughly 20 thousand similar lines; an excerpt is shown below:
,filename,label,grader1,grader2,grader3,grader4,grader5
0,98c0c8fe7f17477da6620054936871cd.png,label1,,False,False,,False
1,98c0c8fe7f17477da6620054936871cd.png,label2,,False,False,,False
2,98c0c8fe7f17477da6620054936871cd.png,label3,,False,False,,False
3,98c0c8fe7f17477da6620054936871cd.png,label4,,False,False,,False
4,98c0c8fe7f17477da6620054936871cd.png,label5,,False,False,,False
5,98c0c8fe7f17477da6620054936871cd.png,label8,,False,False,,False
6,98c0c8fe7f17477da6620054936871cd.png,label9,,False,False,,False
7,98c0c8fe7f17477da6620054936871cd.png,label10,,False,False,,False
...
14,e369b623efbe4fae8efaf5d61d47b7cd.png,label8,False,False,,,False
15,e369b623efbe4fae8efaf5d61d47b7cd.png,label9,False,False,,,False
16,e369b623efbe4fae8efaf5d61d47b7cd.png,label10,False,False,,,False
From my understanding, the memory usage should not increase with the number of images, as everything (variables and figures) gets deleted or closed at the end of each loop iteration. What am I doing wrong, and what could be the cause of the Out of Memory issue?
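To test that assumption, the memory in use after each iteration could be sampled with the standard-library tracemalloc module. This is a minimal sketch: measure_growth, leaky_step and clean_step are hypothetical names, and the two toy steps merely stand in for the real loop body.

```python
import gc
import tracemalloc


def measure_growth(step, n_iterations):
    """Call step(i) n_iterations times; return traced memory in bytes after each call."""
    tracemalloc.start()
    usage = []
    try:
        for i in range(n_iterations):
            step(i)
            gc.collect()  # collect unreachable reference cycles before sampling
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
    finally:
        tracemalloc.stop()
    return usage


leaked = []  # deliberately keeps references alive to simulate a leak


def leaky_step(i):
    leaked.append(bytearray(100_000))  # stays referenced after the call returns


def clean_step(i):
    _ = bytearray(100_000)  # released as soon as the function returns


print("leaky:", measure_growth(leaky_step, 5))
print("clean:", measure_growth(clean_step, 5))
```

A steadily rising series points to objects that are still referenced somewhere. Note that tracemalloc only sees allocations made through Python's allocators, so memory held by native code would not appear here and would have to be checked against the process memory reported by the operating system.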
The following posts did not help with my issue: