In Brief

How can I monkey-patch module A from module B when module A's functions need to remain importable, so that I can still run module A's functions with the multiprocessing standard library package?

Background

A client requested a hotfix that will not be applicable to any of our other clients, so I created a new branch and wrote a separate module just for them, which keeps it easy to merge in changes from the master branch. To maintain the client's backward compatibility with pre-hotfix behavior, I implemented the hotfix as a configurable setting in the app. Thus I didn't want to replace my old code, just patch it when the setting is turned on. I did this by monkey patching.

Code Structure

The __main__ module reads in the configuration file. If the configuration turns on the hotfix's switch, __main__ patches my engine module by replacing a couple of functions with code defined in the hotfix module; in essence, the replaced function is the key function used by a maximization routine. The engine module later loads up a pool of multiprocessing workers.
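
For concreteness, here is a minimal sketch of that arrangement; all of the module and function names (engine, hotfix, key_function, read_config) are hypothetical stand-ins, and the key functions are dummies:

    # engine.py (hypothetical)
    def key_function(candidate):
        """Pre-hotfix behavior: the key the maximizer ranks candidates by."""
        return candidate

    def maximize(candidates):
        return max(candidates, key=key_function)

    # hotfix.py (hypothetical)
    def key_function(candidate):
        """Client-specific replacement behavior."""
        return -candidate

    # __main__ module (hypothetical)
    import engine
    import hotfix

    def read_config():
        return {"hotfix_enabled": True}  # stand-in for parsing the real file

    if read_config()["hotfix_enabled"]:
        engine.key_function = hotfix.key_function  # the monkey patch

    # ... later, engine starts a multiprocessing pool of workers ...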

The Problem

Once a multiprocessing worker gets started, the first thing multiprocessing does is re-import* the engine module and look for the key function that __main__ had tried to replace (then multiprocessing hands control over to my code and the maximization algorithm begins). Since engine is re-imported by a brand-new process, and that process does not re-run __main__ (where the configuration file gets read) because doing so would cause an infinite loop, the worker never knows to re-monkey-patch engine.

The Question

How can I maintain modularity in my code (i.e., keeping the hotfix code in a separate module) and still take advantage of Python's multiprocessing package?

* Note my code has to work on Windows (for my client) and Unix (for my sanity...)

wkschwartz

3 Answers

This sounds like a place where monkey-patching just won't work. It's easier to extract the functions in question into a separate module and have engine import them from there. Perhaps where to import them from can itself be a configuration setting.
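
A minimal sketch of that idea, assuming a hypothetical config module whose KEY_FUNCTION_MODULE setting names the module to import from (e.g. "hotfix"):

    # engine.py -- imports the key function from a configurable module
    import importlib

    import config  # hypothetical: holds KEY_FUNCTION_MODULE, e.g. "hotfix"

    _impl = importlib.import_module(config.KEY_FUNCTION_MODULE)
    key_function = _impl.key_function

    def maximize(candidates):
        return max(candidates, key=key_function)

Because the selection happens at import time inside engine itself, it is repeated automatically whenever a worker process re-imports engine.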

Another way to modularize this is to use some sort of component architecture, such as the Zope Component Architecture (ZCA). That last option is what I would go with, but only because I'm used to it, so there is no extra learning involved for me.
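
A rough sketch of what the ZCA route might look like using zope.interface and zope.component; the interface and the two classes here are invented for illustration:

    from zope.interface import Interface, implementer
    from zope.component import getGlobalSiteManager, getUtility

    class IKeyFunction(Interface):
        def score(candidate):
            """Return the value the maximizer should rank a candidate by."""

    @implementer(IKeyFunction)
    class DefaultKey:
        def score(self, candidate):
            return candidate

    @implementer(IKeyFunction)
    class HotfixKey:
        def score(self, candidate):
            return -candidate

    # Register whichever implementation the configuration calls for:
    getGlobalSiteManager().registerUtility(HotfixKey(), IKeyFunction)

    # engine then looks the component up instead of hardcoding it:
    key = getUtility(IKeyFunction)

Note that the registration still has to run in every process, so it would live somewhere that engine itself imports.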

Lennart Regebro

To make it work on a UNIX/Linux OS which has fork(), you don't need to do anything special since the new process has access to the same (monkey-patched) classes as the parent.

To make it work on Windows, have your __main__ module read the configuration on import (put the read_config/patch_engine call at global scope), but do the multiprocessing (engine execution) inside an if __name__ == '__main__' guard.

Then the read-config code runs whenever __main__ is imported (either from the command line or from a multiprocessing re-import), but the if __name__ == '__main__' code runs only when your script is invoked from the command line (since __main__ is re-imported under a different name in the child process).
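
A sketch of that layout, reusing the hypothetical names from the question (read_config, engine.key_function, hotfix.key_function, engine.run):

    # __main__ module
    import engine
    import hotfix

    def read_config():
        return {"hotfix_enabled": True}  # stand-in for the real config parser

    # Module scope: this runs on every import of the module, including the
    # re-import multiprocessing performs in each Windows worker process.
    if read_config()["hotfix_enabled"]:
        engine.key_function = hotfix.key_function

    if __name__ == "__main__":
        # Only the original command-line invocation reaches this block; in
        # the workers the module is re-imported under a different __name__.
        engine.run()  # starts the multiprocessing pool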

nneonneo

It sounds like you are going to have to modify engine.py to check a configuration file, and have it patch itself if needed.

To work on both Unix and Windows, engine can keep a global CONFIG_DONE variable to decide whether it needs to check the configuration file again.
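
A sketch of a self-patching engine.py along those lines; read_config and the hotfix module are hypothetical stand-ins:

    # engine.py
    CONFIG_DONE = False

    def key_function(candidate):
        return candidate  # default, pre-hotfix behavior

    def read_config():
        return {"hotfix_enabled": False}  # stand-in for the real config parser

    def apply_config():
        """Patch this module in place; cheap to call more than once."""
        global CONFIG_DONE, key_function
        if CONFIG_DONE:
            return
        if read_config()["hotfix_enabled"]:
            from hotfix import key_function as patched
            key_function = patched
        CONFIG_DONE = True

    apply_config()  # runs once in every process that imports engine

Since every worker re-imports engine, the check runs in each process without any help from __main__.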

Ethan Furman