How to emulate Py_Main in pure Python

Question

I'm trying to write a Python program that emulates the behavior of Py_Main when running a Python script. In other words, I want behavior similar to running python some_file.py from the command line, but from inside my Python script. In my case, I'm running inside a parallel programming environment and it is important to stay in the same process (so I can use shared memory), so I can't just shell out to a new Python interpreter.

You can use execfile (or compile followed by exec in Python 3) to load the file itself. However, from some initial testing, I've found at least 5 differences between execfile and Py_Main:

1. execfile does not set __file__ correctly. By default it inherits the parent's globals, and therefore will see the parent's value of __file__. To work around this you have to create a new dictionary for globals and set __file__ yourself.
2. Similarly, execfile does not set __name__ by default. You have to set it to something (e.g. __main__) yourself.
3. In fact, execfile does not create a new module at all. You can work around this by using imp.new_module() to create a fresh module and using its namespace as the global scope for execfile.
4. Because execfile doesn't create a module, it doesn't add one to sys.modules. However, you have to be careful when adding it yourself: e.g. __main__ might already be in sys.modules if you're executing the top-level script as python top_level.py. I'm not sure there's a clean way around this except hiding the current value of __main__ and restoring it afterwards. You could register the module under another name, but then you wouldn't be following the normal Python conventions associated with executing scripts.
5. execfile does not add the directory of the file to sys.path. As a result, you can't import another module in the same directory from inside the script, which is something that works with python some_script.py.
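
For concreteness, here is a minimal sketch of what working around all five points by hand might look like. It is written for Python 3 (compile plus exec, and types.ModuleType in place of imp.new_module()), and run_script is a hypothetical helper name, not an existing API. It covers only the five differences listed above and makes no claim to match everything Py_Main does.

```python
import os
import sys
import types


def run_script(path):
    """Execute the file at `path` in-process, roughly like `python path`.

    Hypothetical helper: handles only the five differences listed above.
    """
    path = os.path.abspath(path)
    with open(path, "rb") as f:
        code = compile(f.read(), path, "exec")

    # Points 2 and 3: a fresh module named __main__, so the script does not
    # see the caller's globals.
    module = types.ModuleType("__main__")
    # Point 1: set __file__ explicitly.
    module.__file__ = path

    saved_main = sys.modules.get("__main__")
    saved_path = list(sys.path)
    # Point 4: temporarily hide the current __main__.
    sys.modules["__main__"] = module
    # Point 5: make sibling modules of the script importable.
    sys.path.insert(0, os.path.dirname(path))
    try:
        exec(code, module.__dict__)
    finally:
        # Restoring sys.path is a choice made for in-process use; a real
        # interpreter run never needs to undo this.
        sys.path[:] = saved_path
        if saved_main is not None:
            sys.modules["__main__"] = saved_main
        else:
            sys.modules.pop("__main__", None)
    return module
```

Even with all of this in place, I have no evidence that the result matches Py_Main beyond the five points it was written against.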

None of these are insurmountable problems, but I'm concerned this isn't a closed set, and that if I go this route I'll continue to find behaviors that I haven't modeled correctly in my solution.

My question:

Is there a way to implement this in Python such that I can be sure I'm correctly modeling all the behaviors of Py_Main?
Failing that, is there at least a C API I could use that would do this, considering that the Python interpreter may already be initialized (i.e. I cannot just call Py_Main)? E.g. does PyRun_SimpleFile have the behavior I want? From the documentation, this is not at all obvious.
Or, in the worst case, is there at least documentation on what exactly Py_Main is doing, so that I can determine what I do or do not want to spend effort imitating in Python? The official documentation is pretty vague, and all of the above I determined by manually comparing with the behavior of execfile.

Edit: A note about runpy: After asking this question, I discovered the runpy.run_path function. At first glance, this looked like the solution I needed. However, it does not address my point #5 above. Specifically, it's possible to observe the difference between runpy.run_path("some_dir/test_execfile_helper.py") and python some_dir/test_execfile_helper.py in this scenario:

$ cat test_execfile.py
import runpy
runpy.run_path("some_dir/test_execfile_helper.py")

$ cat some_dir/test_execfile_helper.py
print("attempting to import some_other_file")
import some_other_file

$ cat some_dir/some_other_file.py
print("in some_other_file.py")

$ python some_dir/test_execfile_helper.py 
attempting to import some_other_file
in some_other_file.py

$ python test_execfile.py
attempting to import some_other_file
Traceback (most recent call last):
  File "test_execfile.py", line 2, in <module>
    runpy.run_path("some_dir/test_execfile_helper.py")
  File "/usr/lib/python2.7/runpy.py", line 252, in run_path
    return _run_module_code(code, init_globals, run_name, path_name)
  File "/usr/lib/python2.7/runpy.py", line 82, in _run_module_code
    mod_name, mod_fname, mod_loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "some_dir/test_execfile_helper.py", line 2, in <module>
    import some_other_file
ImportError: No module named some_other_file

So, perhaps this is a better starting point than execfile, but my original critique still stands: I still have no way of knowing whether the issues listed above form a closed set.
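
For reference, a sketch of how the sys.path gap could be patched on top of runpy.run_path. The wrapper name run_path_like_cli is hypothetical; the only runpy call used is run_path with its run_name argument. This is written for Python 3 and addresses point #5 only, so it does nothing to settle whether other differences remain.

```python
import os
import runpy
import sys


def run_path_like_cli(path):
    """Run `path` via runpy, with the script's directory on sys.path.

    Hypothetical wrapper: mimics only the sys.path[0] behavior of
    `python some_dir/script.py`, nothing else.
    """
    script_dir = os.path.dirname(os.path.abspath(path))
    sys.path.insert(0, script_dir)
    try:
        # run_name="__main__" so `if __name__ == "__main__":` blocks fire.
        return runpy.run_path(path, run_name="__main__")
    finally:
        # Drop the entry we added; list.remove deletes the first match,
        # which is ours unless the script itself reordered sys.path.
        try:
            sys.path.remove(script_dir)
        except ValueError:
            pass
```

With this wrapper, the test_execfile.py example above would call run_path_like_cli("some_dir/test_execfile_helper.py") instead of runpy.run_path(...), and the import of some_other_file should then resolve.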
