
I wanted a quick and dirty way to get some file names without typing in my shell, so I have the following piece of code:

from tkinter.filedialog import askopenfile

file = askopenfile()

Now this all works fine, but it does create a superfluous tkinter GUI that needs to be closed. I know I can do this to suppress it:

import tkinter as tk
tk.Tk().withdraw()    

But that doesn't mean it isn't loaded in the background. It just means there's now a Tk() object that I can't close/destroy.


So this brought me to my real question.

It seems that each time I create a Tk(), regardless of whether I del or destroy() it, the memory isn't freed up. See below:

import tkinter as tk
import os, psutil
process = psutil.Process(os.getpid())
def mem(): print(f'{process.memory_info().rss:,}')

# initial memory usage
mem()
# 21,475,328

for i in range(20):
    root.append(tk.Tk())
    root[-1].destroy()
    mem()

# 24,952,832
# 26,251,264
# ...
# 47,591,424
# 48,865,280

# try deleting the root instead

del root
mem()

# 50,819,072

As seen, Python doesn't free up the memory even after every instance of Tk() is destroyed and root is deleted. This, however, isn't the case for other objects:

class Foo():
    def __init__(self):
        # create a list that takes up approximately the same size as a Tk() on average
        self.lst = list(range(11500))    

for i in range(20):
    root.append(Foo())
    del root[-1]
    mem()

# 52,162,560
# 52,162,560
# ...
# 52,162,560

So my question is, why is it different between Tk() and my Foo(), and why doesn't destroying/deleting the Tk() created free up the memory taken up?

Is there something obvious I've missed? Is my test inadequate to confirm my suspicion? I've searched here and on Google but found few answers.

Edit: Below are a few other methods I've tried (and failed) with the recommendations in the comments:

# Force garbage collection
import gc
gc.collect()

# quit() method
root.quit()

# delete the entire tkinter reference
del tk
r.ook
  • I think the garbage collector runs on its own terms. You can try to force GC after you destroy your tk instance. See if that helps. – Mike - SMT Oct 16 '18 at 15:38
  • The module `tkinter` itself could be holding references to the objects. – chepner Oct 16 '18 at 15:41
  • Nope, `gc.collect()` or `del tk` did nothing to reduce the memory footprint. – r.ook Oct 16 '18 at 15:41
  • Also, `root[-1].destroy()` is not the same as `del root[-1]`. (Although I would expect the garbage collector to clean everything up soon after you run `del root`.) – chepner Oct 16 '18 at 15:42
  • Take a look at [this post](https://stackoverflow.com/a/1316799/7475225). Alex Martelli points out that Python uses something called a `free list` and this can cause the memory issue you are seeing. There appears to be a way to work around this by using a subprocess so look into that if it is a big issue for you. – Mike - SMT Oct 16 '18 at 15:45
  • Calling `Tk()` does *far, far more* than just creating a window you might not want; it's loading and initializing an entirely separate programming environment, the Tcl interpreter that actually implements all of the GUI functionality. An implicit reference is kept to this interpreter, so that the functions in the `Tkinter` module can actually do their work. – jasonharper Oct 16 '18 at 15:46
  • @chepner I tried to force garbage collections after and the memory usage was still an issue. I believe the problem is due to `free list` in python. – Mike - SMT Oct 16 '18 at 15:46
  • @jasonharper good insight, however I would have expected that `root.destroy()` would also clean up the external interpreter if that's the case. At the very least, I wouldn't have expected *multiple* interpreters to be created each time I call `Tk()`, given that the memory increment is loosely consistent between each `Tk()` call. I'm not trying to rag on `tkinter`, but not having a way to implicitly/explicitly murder the interpreter upon finishing the GUI feels disappointing. – r.ook Oct 16 '18 at 15:54
  • @Mike-SMT It's not a "problem", per se. True, Python isn't returning the memory to the operating system, but it's available for future use in lieu of requesting more memory from the OS. – chepner Oct 16 '18 at 15:55
  • @Idlehands I think `destroy()` gets rid of the object and `quit()` is supposed to end the interpreter for tk. That said I also tested `quit()` with the same results on memory. – Mike - SMT Oct 16 '18 at 15:56
  • @Mike-SMT unfortunately `quit()` doesn't show a memory decrease as well. – r.ook Oct 16 '18 at 15:59
  • I think the solution is to run tkinter in a subprocess and then you can end that process to manage the memory issue. I would write up an example but I am still reading up on how it works. I have never used subprocess before. – Mike - SMT Oct 16 '18 at 16:01
  • @Mike-SMT Luckily I'm not limited by resources to reach that point yet, but I would imagine the workaround with `subprocess` wouldn't be pretty either. I just find it odd that a standard module of this magnitude doesn't have a way to optimize the memory implicitly or explicitly, and believed I must be missing something. – r.ook Oct 16 '18 at 16:04
  • @Idlehands well it may seem odd but I am sure the developers of Python have a good reason for using a `free list`. I am sure its benefit outweighs the memory cost. – Mike - SMT Oct 16 '18 at 16:06
  • @chepner I wonder how true that is though. In my shell I actually created the `Tk()`s first, which incremented to about 50MB usage after trying to delete/destroy everything. Subsequently when I tried my `Foo()` run, the memory usage just increased over 50MB instead of using the presumed "unreturned" memory. Python was requesting more mem from OS in lieu of the mem taken up by `Tk()`s. – r.ook Oct 16 '18 at 16:10
  • The "standard usage" of Tkinter is that the GUI lasts for the entire lifetime of the application (and, in fact, *is* the application), so there's no point in being able to destroy it earlier. It's entirely possible that the Tcl interpreter doesn't even implement a way to cleanly shut itself down prior to process termination, since that would be utterly pointless in a native Tcl application. – jasonharper Oct 16 '18 at 16:56
  • The proper way to use tkinter is to create a single root window at the start of the program, and let it live until the end of the program. What's the point of creating multiple root windows? – Bryan Oakley Oct 16 '18 at 18:42
  • @BryanOakley, in most cases yes, but at the very beginning of my question, I wanted a quick and dirty way to get file name with a GUI instead of typing text every time in the shell. But I noticed each time I call `askopenfile()` the memory increases and doesn't go back down, hence this investigation. You're right for most common usage though. – r.ook Oct 16 '18 at 18:48
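
The subprocess workaround mentioned in the comments above was never written up in this thread. A minimal sketch of what it might look like follows; the names `DIALOG_SCRIPT` and `ask_filename_in_subprocess` are made up for illustration, and the approach assumes that paying for a fresh interpreter on every dialog is acceptable. Because the child process owns the Tk root and the Tcl interpreter, everything it allocated is handed back to the OS when the child exits.

```python
import subprocess
import sys

# Hypothetical child script: it owns the Tk root and the Tcl interpreter,
# so everything it allocates goes away when the child process exits.
DIALOG_SCRIPT = """
import tkinter as tk
from tkinter.filedialog import askopenfilename

root = tk.Tk()
root.withdraw()                      # hide the empty root window
path = askopenfilename(parent=root)
root.destroy()
print(path or "", end="")            # empty output means the dialog was cancelled
"""


def ask_filename_in_subprocess(script=DIALOG_SCRIPT):
    """Run `script` in a fresh interpreter and return whatever it printed."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,   # collect stdout/stderr instead of inheriting them
        text=True,             # decode the output to str
        check=True,            # raise CalledProcessError if the child fails
    )
    return result.stdout
```

Calling `ask_filename_in_subprocess()` from the shell should then return the chosen path as a string, or an empty string if the dialog was cancelled, without the parent process ever importing tkinter.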

2 Answers


There are three issues here, one of which is tkinter's fault, one of which is yours, and one of which is behaving as intended.

The three issues are:

  1. tkinter creates an undetectable reference cycle as part of registering its cleanup handlers, which is only broken by explicitly calling destroy (if you don't do so, the reference cycle is never cleaned, and the resources are held forever)
  2. You're holding on to your Tk objects even after you destroy them
  3. The small object heap is rarely, if ever, returned to the OS before program termination (the memory is kept around for future allocations)

Problem #1 means you must destroy any Tk you create explicitly if there is any chance of recovering the memory.

Problem #2 means that you must explicitly get rid of any reference to a Tk (after destroying it) before creating a new one if you want the memory to be available for other purposes. In some cases, you'd also want to explicitly call tk.NoDefaultRoot() to prevent the first Tk you create from being cached on tkinter as the default root (that said, explicit calls to destroy on such an object will clear the cached default root, so this isn't going to be a problem in many cases).

Issue #3 means you must get rid of the references eagerly, rather than waiting until the end of the program to delete your root list; if you wait until the end to delete it, yes, the memory will be returned to the heap, but not to the OS, so it will look like you're still using all of it. It's not a real problem though; the unused memory will be paged out to disk if the OS is in need of RAM (it usually pages idle pages before active ones), and keeping it around improves the performance of most code.

Specifically, it looks like the .tk attribute of Tk instances isn't being cleaned up even when you explicitly destroy the Tk instance. You can cap the memory growth by changing your loop to get rid of the last reference to the Tk object, or if you just want to free the low level C resources, explicitly unlink .tk after destroying the new Tk element:

# Not necessary, but avoids caching any Tk as a root when you don't want it
tk.NoDefaultRoot()  

root = []  # Missing in your original code, but I'm assuming it was a plain list
for i in range(20):
    root.append(tk.Tk())
    root[-1].destroy()

    # Either drop the reference to the `Tk` completely:
    root[-1] = None
    # or, instead, just drop the reference to its C level worker object
    # (use only one of the two; once root[-1] is None it has no .tk attribute):
    # root[-1].tk = None

    # Optionally, call gc.collect() here to forcibly reclaim memory faster
    # otherwise you're likely to see memory usage grow by a few KB as uncleaned
    # cycles aren't reclaimed in time so we see phantom leaks (that would
    # eventually be cleaned)
    mem()

Explicitly clearing the reference allows the underlying resources to be cleaned, based on the output from my slightly modified script:

12,152,832
17,539,072
17,924,096  # At this point, the original code was above 18.8M bytes
17,965,056
17,965,056  # At this point, the original code was above 21.7M bytes
... remains unchanged until end of program if gc.collect() called regularly ...

The fact that the memory is never completely reclaimed for the first object isn't surprising. Memory allocators rarely bother to actually return the memory to the operating system unless the allocation was huge (large enough to trigger a mode switch that makes an independent request to the OS for memory that is managed separately from the "small object heap"). Otherwise, they maintain a free list of memory that is no longer in use and can be reused.

The ~6 MB of "waste" here was likely a bunch of small allocations involved in creating the Tk object itself and the tree of objects it manages, that, while subsequently returned to the heap for reuse, will not be returned to the OS until the program exits (that said, if that part of the heap is never used again, the OS may preferentially page the unused parts out to disk if it runs low on memory). You can see how this optimization helped by noticing that the memory use stabilizes almost immediately; the new tk.Tk() objects are just reusing the same memory as the first ones (the lack of complete stability is likely due to heap fragmentation causing a need for small additional allocations).
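
For the original "quick and dirty file dialog" use case, the destroy-then-drop-the-reference advice could be wrapped in a small context manager. This is only a sketch: `temporary_root` and its `factory` parameter are hypothetical names, not part of tkinter, and the `factory` hook exists purely so the helper can be exercised without a display.

```python
import gc
from contextlib import contextmanager


@contextmanager
def temporary_root(factory=None):
    """Yield a hidden root window, then destroy it and drop the local reference.

    `factory` is a hypothetical hook for illustration; it defaults to tk.Tk.
    """
    if factory is None:
        import tkinter as tk   # imported lazily so the helper loads without a display
        factory = tk.Tk
    root = factory()
    try:
        root.withdraw()        # hide the empty root window
        yield root
    finally:
        root.destroy()         # required: breaks the cycle in the cleanup handlers
        del root               # drop this function's reference to the Tk object
        gc.collect()           # optional: reclaim any leftover cycles right away
```

A call might look like `with temporary_root() as r: name = askopenfilename(parent=r)`. Note that the name bound by `as` still refers to the destroyed object after the block, so it would need a `del r` (or to go out of scope) before the memory can be reused.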

ShadowRanger
  • Part of the extra memory is going to maintaining an ever-growing list of references to dead tkinter instances. If you really intend to reclaim all memory, why keep references around in the list? Why not remove the references from the list? I would think that would trigger any extra garbage collection on objects such as the .tk reference. – Bryan Oakley Oct 16 '18 at 18:36
  • Nice effort. With your insight I also tried `root[-1].__dict__.clear()` to see if any other object is hogging the memory, and there doesn't seem to be any improvement than simply `del root[-1].tk`. While it doesn't exactly solve the problem (and a weird jump of memory still occurs around the 8~15th iteration out of 20) it does seem to identify a major culprit. If no other answers come about I'll be happy to accept this answer. – r.ook Oct 16 '18 at 18:39
  • You seem to be going through more trouble than necessary. Why not just call `root.pop()`? – Bryan Oakley Oct 16 '18 at 18:40
  • @BryanOakley that would work in this specific test, but if `root` was a single instance of `Tk()` I would have to call `del root.tk`, no? – r.ook Oct 16 '18 at 18:43
  • @idlehands: I'm not entirely sure what you're asking. When you delete `root`, all attributes of `root` will also be deleted. There shouldn't be any need to manually delete `root.tk` since `root` itself will no longer exist. – Bryan Oakley Oct 16 '18 at 18:48
  • @BryanOakley, In my original question I did `del root` and the memory still wasn't released. Your comment prompted me to try again however, and I notice there *is* a difference in `del root[-1]` during the loop (releases some memory) versus `del root` after everything. I suppose `del` is not recursive and only deletes the surface layer of the `list`. Interesting. – r.ook Oct 16 '18 at 18:51
  • @BryanOakley: My original answer was slightly off (updated one seems correct), but it wasn't as simple as deleting `root` or its elements alone; if `destroy` isn't called, the reference cycle hidden in the cleanup handlers keeps the various `Tk` elements alive forever, even if every outside reference is severed. – ShadowRanger Oct 16 '18 at 22:18
  • @ShadowRanger: yes. I should have been more clear. You definitely need to call `destroy` on the window in addition to deleting the reference. – Bryan Oakley Oct 16 '18 at 22:32

When you create an instance of Tk, you are creating more than just a widget. You are creating an object that has several attributes (an embedded tcl interpreter, a list of widgets, etc). When you do root.destroy(), you're only destroying some of the data owned by that object. The object itself still exists and takes up memory. Since you keep a reference to that object in a list, that object never gets garbage-collected so the memory hangs around.

When you create a root window with root = tk.Tk(), you get back an object (root). If you look at the attributes of that object with vars, you see the following:

>>> root = tk.Tk()
>>> vars(root)
{'children': {}, '_tkloaded': 1, 'master': None, '_tclCommands': ['tkerror', 'exit', '4463962184destroy'], 'tk': <_tkinter.tkapp object at 0x10a1d7f30>}

When you call root.destroy(), you are only destroying the widget itself (essentially, the elements in the _tclCommands list). The other parts of the object remain intact.

>>> root.destroy()
>>> vars(root)
{'children': {}, '_tkloaded': 1, 'master': None, '_tclCommands': None, 'tk': <_tkinter.tkapp object at 0x10a1d7f30>}

Notice how _tclCommands has been set to None, but the rest of the attributes are still taking up memory. One of those, tk, takes up a fair amount of memory that never gets reclaimed.

To completely remove the object, you need to delete it. In your case you need to remove the item from the list so that there are no longer any references to the object. You can then wait for the garbage collector to work its magic, or you can explicitly call the garbage collector.

This may not reclaim 100% of the memory, but it should get you pretty close.
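
The difference between calling destroy() and dropping the last reference can be illustrated without tkinter at all. In the sketch below, `FakeRoot` is a made-up stand-in whose destroy() clears only part of its state, loosely mirroring the vars() output above; a weak reference is used to check whether the object is still alive.

```python
import gc
import weakref


class FakeRoot:
    """Hypothetical stand-in for tk.Tk: destroy() releases only part of its state."""

    def __init__(self):
        self.commands = ["tkerror", "exit"]
        self.tk = list(range(10_000))   # plays the role of the heavy .tk attribute

    def destroy(self):
        self.commands = None            # .tk is left untouched


roots = [FakeRoot()]
probe = weakref.ref(roots[-1])          # checks liveness without keeping the object alive

roots[-1].destroy()
alive_after_destroy = probe() is not None   # the list still references the object

del roots[-1]                           # remove the last reference
gc.collect()                            # not strictly needed in CPython, but harmless
alive_after_del = probe() is not None

print(alive_after_destroy, alive_after_del)   # True False
```

The object, including its large attribute, only becomes collectable once the list entry is removed; destroy() by itself does not do that.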


All that being said, tkinter wasn't designed to be used this way. The underlying expectation is that you create a single instance of Tk at the start of your program, and keep that single instance alive until your program exits.

In your case I recommend you create the root window once at the start of the program, and hide it. You can then call askopenfile() as often as you like throughout your program. If you want something more general-purpose, create a function that creates the root window the first time it is called and caches the window so that it only has to create it once.
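
A minimal sketch of such a caching function might look like the following. The names `get_hidden_root` and `ask_filename` are invented for illustration, and the `factory` parameter is an assumption added only so the caching logic can be tried without a display; in normal use it is left at its default and tk.Tk is used.

```python
_root = None   # module-level cache for the single hidden root window


def get_hidden_root(factory=None):
    """Create the root window on the first call and reuse it afterwards.

    `factory` is a hypothetical hook for illustration; it defaults to tk.Tk.
    """
    global _root
    if _root is None:
        if factory is None:
            import tkinter as tk   # imported lazily, only when a real root is needed
            factory = tk.Tk
        _root = factory()
        _root.withdraw()           # keep the empty root window out of sight
    return _root


def ask_filename(**options):
    """Show a file-open dialog parented to the cached hidden root."""
    from tkinter.filedialog import askopenfilename
    return askopenfilename(parent=get_hidden_root(), **options)
```

With this in place, `ask_filename()` can be called repeatedly and only the first call pays for creating a Tk instance, so memory use should level off instead of growing with each dialog.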

Bryan Oakley