
This is more of a theoretical question. I have implemented a cache for my "special" needs. The cache holds handles to subprocesses that would run indefinitely unless stopped by a message sent to them through unix named pipes. The cache uses an active, time-based eviction policy driven by a single background daemon thread.

Now the problem comes up when the main program terminates while the cache still has entries. I thought I'd use atexit on a staticmethod within my Cache class, but by the time atexit invokes my cleanup function the entries have already been freed, leaving me unable to shut down the associated subprocesses.
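
For illustration, each cache entry is roughly a subprocess handle plus the named pipe it listens on; a minimal sketch of stopping one (the attribute names proc and pipe_path are made up):

class Entry(object):
    # Hypothetical cache entry: a subprocess plus the FIFO used to stop it.
    def __init__(self, proc, pipe_path):
        self.proc = proc            # subprocess.Popen handle
        self.pipe_path = pipe_path  # unix named pipe the subprocess reads from

    def stop(self):
        # Send the quit message through the FIFO, then wait for the process to exit.
        with open(self.pipe_path, "w") as pipe:
            pipe.write("quit\n")
        self.proc.wait(timeout=30)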

class LinkedList(object):
    # Standard doubly linked-list implementation with a twist: get(node) moves the retrieved node to the head.
    pass

zero = Node(0)
one = Node(1)
two = Node(2)
l = LinkedList()
l.prepend(zero)
l.prepend(one)
l.prepend(two)
l.print()  # prints the list from tail to head
# 0
# 1
# 2
l.get(one)  # moves `one` to the head
l.print()
# 0
# 2
# 1

My cache is a class-based decorator, but I'd rather not go into too much detail:

import atexit

class Cache(object):
    list = LinkedList()

    def borrow(self, *args, **kwargs):
        # capacity check and tail eviction omitted
        node = Node(...)  # node wrapping the subprocess handle
        Cache.list.prepend(node)
        return node

    @staticmethod
    def cleanup():
        Cache.list.print()

atexit.register(Cache.cleanup)
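
Eventually cleanup should stop every cached subprocess as well, not just print the list; roughly like this inside the Cache class, assuming each node's value is an Entry as sketched above and the list exposes its head (both assumptions on my part):

    @staticmethod
    def cleanup():
        node = Cache.list.head      # assumed: LinkedList exposes its head node
        while node is not None:
            node.value.stop()       # send the quit message through the FIFO
            node = node.next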

In the logs I see the following with a capacity of 3:

Caching: 0
Cached(0): 0
Caching: 1
Cached(1): 1, 0
Caching: 2
Cached(2): 2, 1, 0
Caching: 3
Evicting(0): 3, 2, 1
Cached(3): 3, 2, 1
Caching: 4
Evicting(1): 4, 3, 2
Cached(4): 4, 3, 2
Cleanup:

During cleanup my list is empty and my nodes are gone. It seems that during exit the atexit handler runs after the objects I need have already been freed, so the handles to the subprocesses are gone and I have no way of terminating them. Is there something similar to atexit.register that runs on exit while my objects are still alive?
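
For reference, a minimal single-process sample does not show the problem; there the atexit handler still sees module-level objects intact, e.g.:

import atexit

entries = ["0", "1", "2"]  # stand-in for the cached nodes

def cleanup():
    # Runs at normal interpreter exit; `entries` is still intact here.
    print("Cleanup:", entries)

atexit.register(cleanup)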

Update: As requested in the comments, my main application is a Flask app:

app = flask.Flask(__name__)
try:
    # read configuration
    # configure loggers
    app.run(host=util.get_hostname(), port=config.get_port(), debug=config.is_debug, processes=1, threaded=True)
except:
    # handle exceptions
    pass

The rest is the same as originally provided. One special note: in my environment hosts are restarted automatically every Monday. Every running application gets a SIGTERM and is then given ~2 minutes to shut down before receiving a SIGKILL.

  • The safe way to do this is to make your cleanup code part of a context manager. I.e. `with Cache() as cache: main(cache)` (see the sketch after these comments). – juanpa.arrivillaga Nov 03 '20 at 09:45
  • As to your specific question, though, I would not have expected `atexit` to work that way, can you provide a [mcve] of this behavior? – juanpa.arrivillaga Nov 03 '20 at 09:47
  • yeah... a context manager does not work. My main is a completely separate module, and Cache is a class-based decorator with kwargs read from a configuration file during startup. I'd love to provide a minimal sample, but with minimal samples I couldn't reproduce this behavior, hence the question. The docs don't say anything about it. To put it simply: is it possible that atexit runs AFTER objects have already been freed? – Display name Nov 03 '20 at 10:10
  • I mean, those don't really preclude you from using a context manager. In any case, I suspect something else is going on though. – juanpa.arrivillaga Nov 03 '20 at 10:13
  • you're absolutely right... I found the issue. It was caused by Flask: in debug mode 2 processes are started. The first one has the atexit handler registered, and in the 2nd the actual Cache is created with entries. Upon shutdown the atexit handler is triggered, but since it's in a different process there are no entries. Turning off debug mode solves this problem, only for the atexit handler to not be triggered at all. – Display name Nov 03 '20 at 11:57
  • Might be worth tagging this with `flask` and elaborating a bit more on your situation in your question then writing up an answer for it, I suspect you won't be the last to be bitten by it. You can even accept your own answer, nothing wrong with that – juanpa.arrivillaga Nov 03 '20 at 12:05
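
A rough sketch of the context-manager idea from the comments; it assumes the Cache class above and reuses its cleanup(), so __exit__ runs deterministically even if main() raises:

class Cache(object):
    list = LinkedList()

    # ... borrow() and cleanup() as above ...

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Deterministic cleanup, independent of atexit.
        Cache.cleanup()

# usage:
# with Cache() as cache:
#     main(cache)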

1 Answer


Turns out the problem was caused by Flask's debug mode. The main script is a Flask app started with the debug=True kwarg, which in turn starts 2 processes. The first process is where atexit.register ran, while the latter is used for processing requests. My Cache entries resided in the 2nd process, so during exit the atexit handler correctly showed that my cache was empty, as it was triggered from the main process. Removing debug=True from the Flask app partially resolves this issue, since only a single process is started, but then atexit did not get triggered at all because SIGINT was used to shut the app down. So I needed

import signal

# Signal handlers receive (signum, frame), so wrap the no-argument cleanup.
signal.signal(signal.SIGINT, lambda signum, frame: Cache.cleanup())
signal.signal(signal.SIGTERM, lambda signum, frame: Cache.cleanup())
atexit.register(Cache.cleanup)

to cover all cases in my environment. Now the cache works correctly and the subprocesses are terminated.
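
If debug mode (and with it the reloader) has to stay on, another option, assuming the standard Werkzeug reloader, is to register the handler only in the reloader's child process, which Werkzeug marks with the WERKZEUG_RUN_MAIN environment variable:

import atexit
import os

# The reloader's child process (the one that serves requests and owns the
# cache) runs with WERKZEUG_RUN_MAIN set, so register the handler only there.
if os.environ.get("WERKZEUG_RUN_MAIN") == "true":
    atexit.register(Cache.cleanup)

Alternatively, app.run(debug=True, use_reloader=False) should keep the debugger while avoiding the second process.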

  • Why do you need to manually clean up an in-memory structure at interpreter exit anyway? That memory will be cleaned by the OS. – AKX Nov 03 '20 at 13:08
  • "cache holds handles to subprocesses", basically I have a bunch of independent processes running for which I need to send a "quit" message in order for that to exit. – Display name Nov 05 '20 at 08:39
  • Sounds like a job for https://stackoverflow.com/questions/284325/how-to-make-child-process-die-after-parent-exits/284443 – AKX Nov 05 '20 at 10:27