Multiple sources lead me to believe that using a `del` statement is rarely necessary. However, I am working on a program that needs to read in huge files (6 GB), with the filenames picked up from a list, do some transformation, write the results to a datastore, and then pick up the next file.

For instance, the variables `buffer` and `processed` get reassigned on every iteration of the loop. Is there any point in deleting them explicitly?

files = list() # contains 1000 filenames
for file in files:
    buffer = read_from_s3(file)
    processed = process_data(buffer)
    del buffer # needed?
    write_to_another_s3(processed)    
    del processed # needed?     
Deepak Gaur
  • Do the answers to this [question](https://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python) help at all? – quamrana Jan 04 '23 at 21:15
  • My instinct says that you should attempt to release the memory taken by `buffer` immediately after `processed` has been created and also release `processed` immediately after `write_to_another_s3(processed)`. – quamrana Jan 04 '23 at 21:17
  • Good point: if releasing the references matters, it's better done right after the variable is no longer needed. Edited the code, thanks! – Deepak Gaur Jan 04 '23 at 21:22
  • Explicit deletion of `buffer` or `processed` is not necessary if a new value is assigned. Reference counts are kept per object. In CPython, when an object's count reaches zero its memory is released immediately; the cyclic garbage collector only handles reference cycles. Deleting or reassigning a name decrements the reference count of the object it was bound to. You may explicitly trigger garbage collection: https://stackoverflow.com/a/1316793/11492317 Note: a for loop has no scope of its own, so after the loop ends the variables stay bound until the enclosing local scope is left. – Andreas Jan 04 '23 at 21:28
  • Another point: if `process_data()` or `write_to_another_s3()` keeps a reference to the object (incrementing its reference count), the object stays in memory even after you invoke `del`. – Andreas Jan 04 '23 at 21:33
  • You are proposing to `del buffer` *before* calling `process_data(buffer)` in the next line? Of course that will give you an error. – kaya3 Jan 04 '23 at 21:35
  • Also related: [Would you prefer using del or reassigning to None?](https://stackoverflow.com/q/6693946/11082165), [Is there a difference between setting a variable to None or deleting it?](https://stackoverflow.com/q/36087458/11082165), and [Python del statement](https://stackoverflow.com/a/14969798/11082165) – Brian61354270 Jan 04 '23 at 21:46
  • So, it is **very, very** important to understand what `del` does. It does **not delete objects**. It *removes names from a namespace*. The `del` does nothing useful for every iteration except the last one, since, as you point out, those names get assigned something new anyway (which is equivalent to `del`). However, without the `del`, those names stick around in the namespace after the loop is over. This is normally not a problem, because you are in some function, so those names go out of scope when it returns (again, equivalent to `del`). However, if this happens in the global scope, that won't happen. – juanpa.arrivillaga Jan 05 '23 at 00:15
  • *BUT* if you are doing this in a long-running program in the global scope, you have a serious problem in the way you've organized your code to begin with. – juanpa.arrivillaga Jan 05 '23 at 00:15
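
The following is a minimal sketch of the name-versus-object distinction raised in the comments above. It assumes CPython, where an object is freed as soon as its reference count drops to zero. `Payload`, `load`, and `run` are hypothetical stand-ins invented for illustration; they are not the question's `read_from_s3` or `process_data`. The sketch uses a weak reference to check whether the previous iteration's object is still alive while the next one is being loaded.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a large in-memory buffer."""

    def __init__(self, name):
        self.name = name


def load(name, previous_ref, log):
    """Hypothetical loader: record whether the previous payload is still
    alive at the moment the next one is being created."""
    log.append(previous_ref is not None and previous_ref() is not None)
    return Payload(name)


def run(names, use_del):
    """Loop like the one in the question and return the liveness log."""
    log = []
    previous_ref = None
    for name in names:
        # The right-hand side runs BEFORE `buffer` is rebound, so without
        # `del` the old object is still referenced during this call.
        buffer = load(name, previous_ref, log)
        previous_ref = weakref.ref(buffer)  # weak: does not keep it alive
        # ... processing and writing would happen here ...
        if use_del:
            del buffer  # unbinds the name; CPython frees the object now
    return log


print(run(["a", "b", "c"], use_del=False))
print(run(["a", "b", "c"], use_del=True))
```

Under these assumptions, the run without `del` reports the previous object as still alive during every load after the first, while the run with `del` does not. Applied to the question's loop, this suggests that an explicit `del` (or moving the loop body into a function whose locals go away on return) may lower peak memory, because the old `buffer` would otherwise remain referenced until the next `read_from_s3(file)` call returns. Other Python implementations may release the memory at a different time.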

0 Answers