Context
I am developing a 2D animation system where the data is organized as a node-based tree structure with semantics and functionality very similar to XML DOM -- but not quite similar enough to just use an existing XML implementation. For the sake of clarity, let's simplify and assume that the data structure is conceptually defined via the following pseudo-code:
class Node {
Node* firstChild;
Node* nextSibling;
Node* parent;
}
Following state-of-the-art guidelines of modern C++ (Core Guidelines, Herb Sutter talk, etc.), an appropriate choice would be to use smart pointers for owning handles, and raw pointers for non-owning handles. Since ownership is unique, then std::unique_ptr
makes sense:
class Node {
std::unique_ptr<Node> firstChild_;
std::unique_ptr<Node> nextSibling_;
Node* parent_;
public:
Node() : parent_(nullptr) {}
static std::unique_ptr<Node> make() { return std::make_unique<Node>(); }
Node* firstChild() const { return firstChild_.get(); }
Node* nextSibling() const { return nextSibling_.get(); }
Node* parent() const { return parent_; }
Node* appendChild(std::unique_ptr<Node> child) { ... }
std::unique_ptr<Node> removeChild(Node* child) { ... }
Node* makeChild() { return appendChild(make()); }
}
This API assumes that client code knows what it is doing. Like when using a list::iterator
, clients need to read the doc to know which operations may invalidate nodes, and be careful not to blindly hold a Node*
as a data member and dereference it later, since it may be dangling. I am on board with this approach for C++ clients: I believe this is idiomatic C++ and is the price you pay for performance.
However, I absolutely need to provide more safety for Python clients. The animation software is a C++ GUI program with an embedded Python console, which means that many Python clients are expected to be artists with very little programming experience, mostly copy-pasting code from tutorials and adapting it to fit their needs. To make things worse, such unreliable non-tested Python code might be executed while the user is in a long-lasting interactive session with potentially valuable unsaved data. The software should absolutely not crash when a Python client tries to use an expired node. Expected behavior is something along the lines of:
[ Embedded Python Console ]
>>> node = getSelectedNode() # allocated from C++ and selected in the GUI
>>> print(node)
<Node at 0x14e4c30 with 42 child>
>>> node.parent
<Node at 0x14e4d24 with 12 child>
>>> deleteSelection() # or via GUI interaction
>>> print(node)
<Invalid node: the node has already been deleted>
>>> print(node.parent)
InvalidNodeError: the node has already been deleted
>>>
Questions
How would you wrap this C++ API using pybind11 in order to get the expected behavior?
If necessary, how would you change the C++ API to allow for such behavior, and/or make the wrapper code more readable, idiomatic, etc.?
What I've already tried or considered
Naively wrapping as class_<Node>(m, "Node")
does not work: the Python instance cannot know whether the node is still valid or not. Basically, returning Node*
via take_ownership
results in double deletion, and returning them either via reference
or reference_internal
results in undefined behavior (read, segmentation faults).
So we either need some PyNode
trampoline class tracking the validity of nodes via other means (e.g., register for callbacks such as Node::onAboutToDie()
), or changing the C++ ownership model to use ref-counted smart pointers.
What I believed was a good option was to change the C++ code to use shared_ptr
instead of unique_ptr
. I would have Node
derive from enable_shared_from_this
, and wrap it as class_<Node, PyWeakPtr<Node>>(m, "Node")
where PyWeakPtr
is a custom holder storing a weak_ptr
under the hood. Since the nodes derive from enable_shared_from_this
, the holder could have a PyWeakPtr(T*)
constructor converting the raw pointer into a weak_ptr
. In a nutshell, this would leverage the ref-counting ability of shared pointers not to have shared ownership (ownership is still unique), but just to be able to use weak references on Python side, which would check expired()
before every access, throwing a catchable exception if expired. However, so far, my various attempts failed: either I couldn't manage to get the expected behavior, or I couldn't even get them to compile, etc.
A more obvious solution also based on shared_ptr
is to use class_<Node, std::shared_ptr<Node>>(m, "Node")
instead of our custom holder PyWeakPtr
, but it does not work either because it would artificially extend the lifetime of nodes as long as some python variable points to them. I consider this a leak: nodes are not supposed to be shared, and if the user deletes a node in the UI, it should be gone and the destructor immediately called. The semantics really should be that accessing a "semantically-deleted" node in Python raises a Python error (without crashing the C++ program), not silently masquerading as a valid node.
If useful, I might take the time to clean/make_minimal/etc some of these attempts, but for now I'll leave the question as is for conciseness, Maybe some expert out here already have some useful insight, such as whether the whole approach even makes sense, etc. :)
Note that many existing software solving similar problems such as Qt's QDomDocument
or Pixar's Universal Scene Description tend to use ref-counted weak references (more or less hidden internally) even in the C++ API, totally shielding from mistake C++ and Python clients alike. I'm open to this approach as well, although ideally, I think I'd prefer to stick with the general guideline of simply using non-owning raw pointers in C++.