Update: If it helps narrow down the question for anyone, this question is really more about the CPython API and whether or not I'm missing some way to reach information that I need. I'm not asking for solutions to a broader problem, but rather in working on a broader problem I hit upon a specific question about CPython and whether or not it provided a way that was not obvious to me to obtain some specific information. I only tagged the question c because by its nature it requires some C expertise, but it is not a general question about C or specific architectures/platforms.
See also the note below about one possible approach using PyEval_SetTrace, though I was hoping there might be a better way. As another example, there exists a Py_GetArgcArgv which would do the trick here, but only if the Python interpreter were started from the python executable rather than embedded (which might be an acceptable limitation). Also, Py_GetArgcArgv is not documented as part of the API.
I would like to be able to find the address of a C stack frame (i.e. the __builtin_frame_address(0) as defined appropriately for that platform) that is most closely associated with a Python stack frame. In particular, I'd like to find the outermost frame (or close to it) associated with a Python function call, to be defined better below.
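To make "the address of a C stack frame" concrete, here is a minimal sketch, assuming GCC or Clang (`__builtin_frame_address` is a compiler extension, not standard C). The helper names `record_inner_frame` and `frame_gap` are made up for illustration and are not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: records the frame address of its own activation.
 * noinline keeps it from being merged into the caller's frame. */
__attribute__((noinline))
static void record_inner_frame(void **out)
{
    *out = __builtin_frame_address(0);
}

/* Hypothetical helper: byte distance between this function's frame and the
 * frame of a function it calls, regardless of stack growth direction. */
__attribute__((noinline))
static size_t frame_gap(void)
{
    void *outer = __builtin_frame_address(0);
    void *inner = NULL;

    record_inner_frame(&inner);
    assert(outer != NULL && inner != NULL);

    uintptr_t a = (uintptr_t)outer;
    uintptr_t b = (uintptr_t)inner;
    return a > b ? (size_t)(a - b) : (size_t)(b - a);
}
```

Calling `frame_gap()` returns a nonzero distance, since the two activations occupy distinct frames. The exact value depends on the platform, the compiler, and the optimization level.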
The context, to summarize, is that I'm wrapping a C library that uses an obscure, custom-purpose garbage collector. The collector needs a pointer to the bottom of the stack, or at least as far back as there are local variables pointing to objects that the GC should track. Ideally I could mark the bottom of the stack once; since the library is being wrapped in a Python module, it is sufficient to go down to the outermost Python stack frame. The best available alternative would be to manually mark the stack bottom whenever entering calls to the library, but this is not ideal. It would also require patching the library (which may be needed either way), as it currently only allows setting the stack bottom address once, during an initialization function.
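The "mark the stack bottom on every entry" alternative might look roughly like the sketch below. Everything here is an assumption for illustration: `gc_set_stack_bottom`, `GC_ENTER`, `GC_LEAVE`, and `wrapped_library_call` are hypothetical names, and the real library would have to be patched to expose something like the setter. It again assumes GCC or Clang for `__builtin_frame_address`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for collector state; the real library is assumed to keep
 * something similar internally, normally set once by its init function. */
static void *gc_stack_bottom = NULL;
static int gc_entry_depth = 0;

/* Hypothetical patched-in setter that may be called more than once. */
static void gc_set_stack_bottom(void *addr)
{
    gc_stack_bottom = addr;
}

/* Mark the stack bottom only on the outermost entry into the library,
 * so re-entrant calls do not move it closer to the top of the stack. */
#define GC_ENTER()                                                \
    do {                                                          \
        if (gc_entry_depth++ == 0)                                \
            gc_set_stack_bottom(__builtin_frame_address(0));      \
    } while (0)

#define GC_LEAVE()                                                \
    do {                                                          \
        assert(gc_entry_depth > 0);                               \
        gc_entry_depth--;                                         \
    } while (0)

/* Hypothetical wrapper around one library function. */
static int wrapped_library_call(int x)
{
    GC_ENTER();
    int result = x * 2;   /* placeholder for the real library call */
    GC_LEAVE();
    return result;
}
```

The drawback is the one already noted: every entry point into the library has to be bracketed this way, and the collector must tolerate the bottom being reset.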
How exactly a Python stack frame is associated with a C stack frame is ill-defined, as there is technically no hard-and-fast connection between the two. However, for the practical purpose at hand it would be at or close to (depending on compiler optimizations, etc.) the PyEval_EvalFrameEx call for the frame being executed. (I'm not interested in frames that are not currently on the call stack, since the question is obviously meaningless in that case.)
This is all obviously very CPython-specific, and that's OK for my purposes. That being the case, there's no technical reason that the CPython PyFrameObject struct couldn't carry information like this in one of its members, but as far as I can tell there's nothing stored on PyFrameObjects that would allow me to associate one with a C stack frame. For example, my problem would be "solved" well enough, for the purposes of this application, if there were something in PyFrameObject like f_cstack that were used like:
    PyObject* _Py_HOT_FUNCTION
    _PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
    {
        ...
        f->f_executing = 1;
        f->f_cstack = &f;  /* hypothetical member: an address inside this C frame */
        ...
    }
This would work AFAICT: even though f is typically passed in a register, my gcc handles code like this by spilling f to the stack and storing the address of that stack slot. Unfortunately, I can't find anything like this currently.
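The address-of-parameter trick can be tried outside CPython with a stand-in struct. This is only a sketch: `fake_frame`, `run_body`, `eval_frame`, and `observed_cstack` are hypothetical names, and `f_cstack` is the imagined member from above, not something CPython provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyFrameObject with the imagined member. */
typedef struct fake_frame {
    int   f_executing;
    void *f_cstack;      /* address inside the C frame evaluating this frame */
} fake_frame;

/* What some code running "inside" the frame would see. */
static void *observed_cstack = NULL;

/* Placeholder for the code executed while the frame is active. */
static void run_body(fake_frame *f)
{
    observed_cstack = f->f_cstack;
}

__attribute__((noinline))
static int eval_frame(fake_frame *f)
{
    assert(f != NULL);
    f->f_executing = 1;
    /* Taking the address of the parameter obliges the compiler to give it
     * a stack slot in this activation, even if it arrived in a register. */
    f->f_cstack = (void *)&f;

    run_body(f);

    /* The slot is only valid while this call is active, so clear it. */
    f->f_cstack = NULL;
    f->f_executing = 0;
    return 0;
}
```

After `eval_frame(&frame)` returns, `observed_cstack` holds the address seen during execution, and `frame.f_cstack` is NULL again. The recorded pointer must not be dereferenced once the call has returned.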
The best idea I've been able to come up with is to register a PyEval_SetTrace handler, which would be called upon entering Python stack frames and thus give me the opportunity to root around the stack from there. But for the application at hand I really only need to find the "outermost" PyEval_EvalFrameEx call, of which there will be one for any running Python code. Installing a trace callback won't necessarily get me that, and it adds overhead I don't need to every function call.
I fear there is not currently a good solution to this, though it would be handy if there were.
(P.S. I'm also only concerned with the main thread's stack, and not other threads, though any solution that works on the main thread would likely have a similar solution for auxiliary threads.)