I've written a script that reads many types of files and writes their contents, file by file, to an Excel sheet. This happens in a for loop. The problem is that after the data has been written to disk, the allocated memory is still occupied, even though I'm using del and gc.collect(). I've used the memory profiler, and it clearly shows that the memory is not freed when the variables holding the data are deleted. What is the reason?
Here are excerpts of the memory profiler output for two files:
First file:
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   167    121.7 MiB    121.7 MiB           1   @profile
   168                                         def read_file(f_path: str):
...
   178    121.7 MiB      0.0 MiB           1       file_output = ""
...
   202    121.7 MiB      0.0 MiB           1       elif ext == ".pdf":
   203    330.7 MiB    209.0 MiB           1           file_output = read_pdf(f_path)
...
   212    330.7 MiB      0.0 MiB           1       return file_output
Second file:
   202    331.7 MiB      0.0 MiB           1       elif ext == ".pdf":
   203    478.4 MiB    146.7 MiB           1           file_output = read_pdf(f_path)
gc.collect:
   147    478.4 MiB      0.0 MiB           2       output_csv_path = os.path.join(output_path, f"data_csv_temp{p}.csv")
   148    478.4 MiB      0.1 MiB           2       df.to_csv(output_csv_path, index=False)
   149    478.4 MiB      0.0 MiB           2       del file_output
   150    478.4 MiB      0.0 MiB           2       del df
   151    478.4 MiB      0.0 MiB           2       gc.collect()
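For context, the loop has roughly the shape sketched below. This is a simplified, self-contained stand-in rather than my actual code: read_file here is a hypothetical dummy that just returns a large string, process_files is a name made up for this sketch, and a plain file write replaces the pandas df.to_csv call. It also records what tracemalloc reports after each del and gc.collect(), which should show whether the Python-level objects themselves are released, as opposed to the process memory that the profiler reports.

```python
import gc
import os
import tempfile
import tracemalloc


def read_file(f_path: str) -> str:
    """Hypothetical stand-in for the real reader: returns a large string."""
    return "x" * 5_000_000  # about 5 MB of text


def process_files(paths, output_path):
    """Read each file, write its text to disk, then drop the reference.

    Returns the number of bytes tracemalloc still sees allocated after
    each iteration (all zeros if tracemalloc is not tracing).
    """
    traced = []
    for p, f_path in enumerate(paths):
        file_output = read_file(f_path)
        output_csv_path = os.path.join(output_path, f"data_csv_temp{p}.csv")
        # Plain write used here instead of df.to_csv (assumption of this sketch).
        with open(output_csv_path, "w", encoding="utf-8") as fh:
            fh.write("text\n")
            fh.write(file_output)
        del file_output
        gc.collect()
        current, _peak = tracemalloc.get_traced_memory()
        traced.append(current)
    return traced


if __name__ == "__main__":
    tracemalloc.start()
    with tempfile.TemporaryDirectory() as out_dir:
        for i, n_bytes in enumerate(process_files(["a.pdf", "b.pdf"], out_dir)):
            print(f"after file {i}: {n_bytes} bytes still traced")
    tracemalloc.stop()
```

If the traced byte counts stay small here while the process memory in the profiler stays high, that would suggest the objects are being deleted and the question is really about memory not being returned to the operating system.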
Thanks in advance!