I am writing an XS module. I allocate some resource (e.g. malloc() or SvREFCNT_inc()) then do some operations involving the Perl API, then free the resource. This is fine in normal C because C has no exceptions, but code using the Perl API may croak(), thus preventing normal cleanup and leaking the resources. It therefore seems impossible to write correct XS code except for fairly simple cases.

When I croak() myself I can clean up any resources allocated so far, but I may be calling functions that croak() directly which would sidestep any cleanup code I write.

Pseudo-code to illustrate my concern:

static void some_other_function(pTHX_ Data* d) {
  ...
  if (perhaps) croak("Could not frobnicate the data");
}

MODULE = Example  PACKAGE = Example

void
xs(UV n)
  CODE:
  {
    /* Allocate resources needed for this function */
    Data* object_graph;
    Newx(object_graph, 1, Data);
    Data_init(object_graph, n);

    /* Call functions which use the Perl API */
    some_other_function(aTHX_ object_graph);

    /* Clean up before returning.
     * Not run if above code croak()s!
     * Can this be put into the XS equivalent of a  "try...finally" block?
     */
    Data_destroy(object_graph);
    Safefree(object_graph);
  }

So how do I safely clean up resources in XS code? How can I register some destructor that is run when exceptions are thrown, or when I return from XS code back to Perl code?

My ideas and findings so far:

  • I can create a class that runs necessary cleanup in the destructor, then create a mortal SV containing an instance of this class. At some point in the future Perl will free that SV and run my destructor. However, this seems rather backwards, and there has to be a better way.

  • XSAWYERX's XS Fun booklet seems to discuss DESTROY methods at great length, but not the handling of exceptions that originate within XS code.

  • LEONT's Scope::OnExit module features XS code using SAVEDESTRUCTOR() and SAVEDESTRUCTOR_X() macros. These do not seem to be documented.

  • The Perl API lists save_destructor() and save_destructor_x() functions as public but undocumented.

  • Perl's scope.h header (included by perl.h) declares SAVEDESTRUCTOR(f,p) and SAVEDESTRUCTOR_X(f,p) macros, without any further explanation. Judging from context and the Scope::OnExit code, f is a function pointer and p a void pointer that will be passed to f. The _X version is for functions that are declared with the pTHX_ macro parameter.

Am I on the right track with this? Should I use these macros as appropriate? In which Perl version were they introduced? Is there any further guidance available on their use? When precisely are the destructors triggered? Presumably at a point related to the FREETMPS or LEAVE macros?

amon
  • Pardon my ignorance, but if you're not catching the `croak()`, won't everything be `free()`d as the program process(es) finish anyways? – stevieb Aug 20 '17 at 17:55
  • @stevieb I'm not catching the error, but other (Perl) code might, this is a library not an application. Yes, resources like allocated memory will be released by the OS when the process terminates. However: that leaked memory will still be allocated until then. That's not good, particularly for long-running processes like servers. There's a reason why most languages including Perl have some kind of garbage collector. And while memory is the resource I'm interested in releasing, there are other resources as well: e.g. a tempfile that should be deleted, or a mutex or lock that should be released. – amon Aug 20 '17 at 18:10
  • Fair enough, thank you for the clarification. – stevieb Aug 20 '17 at 18:22
  • Great question by the way... – stevieb Aug 20 '17 at 18:28
  • Is the library available publicly at this point? – stevieb Aug 20 '17 at 19:25
  • Re "*but I may be calling functions that croak() directly which would sidestep any cleanup code I write*", e.g. Fatal warnings. This is usually handled by mortalizing early. `SV* sv = sv_2mortal(newSV()); ... return sv;` instead of `SV* sv = newSV(); ... return sv_2mortal(sv);` – ikegami Aug 20 '17 at 19:32
  • @ikegami Yes, and I'm doing that for temporary SVs. But this only works for SVs (and similar Perl data structures). In my case, I'm `malloc()`ing or `Newx()`ing complex C data structures that need a cleanup function to run. One of my ideas is to store an object with suitable cleanup code in a mortal SV which does get the job done. It's just very involved and inconvenient and unlikely that there isn't a better way – which led me to the `SAVEDESTRUCTOR` macros. – amon Aug 20 '17 at 19:48
  • @stevieb No, the library is not available at this point. But as this is a hobby project, I intend to publish it to CPAN if and when it gets finished and reaches a presentable level of quality. The project is kind of an event loop, but with a focus on [Trampolines](https://en.wikipedia.org/wiki/Trampoline_(computing)#High-level_programming) to implement certain control flow – such as tail recursion and yield-style coroutines. All of that can be done in pure Perl, but the performance overhead is unappealing. – amon Aug 20 '17 at 20:00
  • The Perl API typically only croaks on invalid input. If you call the Perl API in a way that could croak, simply check the arguments for validity yourself. The only exception is calling into Perl code where you can use the [`G_EVAL flag`](https://perldoc.perl.org/perlcall.html#G_EVAL). – nwellnhof Aug 20 '17 at 20:05
  • @amon, That makes no sense to me. Why would you malloc or Newx stuff in XS code that's not owned by a Perl variable? – ikegami Aug 20 '17 at 22:39
  • @nwellnhof, You forgot about fatal warnings (which I mentioned above). It's not practical to check if you actually got a number instead of using `SvIV`. But that can warn (uninitialized, or not numeric), which can die. – ikegami Aug 20 '17 at 22:56
  • @ikegami I'm allocating temporary data structures (buffers, queues, graphs) that are needed during the execution of the XS function. They will not be returned as an SV. They are not bounded by some constant so can't be stack-allocated. I've added a piece of pseudo-code to illustrate the structure of my problem. – amon Aug 21 '17 at 08:07
  • (cc @ikegami) Now that I know what to look for, there's a large corpus of motivating examples: [`POSIX::sigaction` registers a destructor to reset a sigmask](https://git.io/v5JE6). [`threads::shared` registers a destructor to release a lock under all circumstances before the xsub returns](https://git.io/v5JEr). [The Perl MongoDB driver registers a destructor to free a C-level parser object](https://git.io/v5JEH), which is necessary as the parsing functions may croak. That's very similar to my use case. – amon Aug 21 '17 at 14:05
  • @ikegami Then simply don't enable fatal warnings in your XS module. – nwellnhof Aug 21 '17 at 16:23
  • @nwellnhof You suggest it is feasible to avoid exceptions by checking all preconditions. In the above comment I've linked a few examples where destructors are used. How would you have structured such code using better checks instead of registering destructor callbacks? Wouldn't that lead to much more complex code and duplicated code? I have read your CommonMark bindings, but there you have the luxury of a 1:1 resource to SV relationship, and comparatively thin bindings around a C library which allows you to defer most exceptions until POSTCALL. – amon Aug 21 '17 at 17:18
  • I didn't know about `SAVEDESTRUCTOR` and it seems like a bulletproof way to run cleanup code. I wouldn't use Perl exceptions to handle errors in my own internal C functions, though. Instead, I'd prefer to return error codes and make the XSUB throw after releasing all resources. This should be good enough for many typical cases, like extracting data from an `AV` or `HV`. Even if there's a way that users of your library can sneak in some weird data that makes a Perl API function throw an exception and cause a memory leak, I wouldn't be too worried unless it's absolutely mission critical code. – nwellnhof Aug 21 '17 at 22:38

1 Answer

Upon further research, it turns out that SAVEDESTRUCTOR is in fact documented, though in perlguts rather than perlapi, and its exact semantics are described there.

I therefore assume that SAVEDESTRUCTOR is supposed to be used as a "finally" block for cleanup, and is sufficiently safe and stable.
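Applied to the pseudo-code from the question, this could look roughly like the sketch below. It is a minimal sketch, not tested against any particular Perl version. Data, Data_init(), Data_destroy() and some_other_function() are the placeholders from the question (their bodies here are invented stand-ins), and cleanup_data() is a hypothetical name for the destructor callback.

```c
#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

/* Hypothetical stand-ins for the question's Data type and helpers. */
typedef struct { UV n; } Data;
static void Data_init(Data* d, UV n) { d->n = n; }
static void Data_destroy(Data* d) { (void) d; /* release members here */ }

/* Destructor callback for SAVEDESTRUCTOR_X: takes the implicit
 * context argument (pTHX_) plus the void pointer that was registered. */
static void cleanup_data(pTHX_ void* p) {
  Data* d = (Data*) p;
  Data_destroy(d);
  Safefree(d);
}

static void some_other_function(pTHX_ Data* d) {
  /* Placeholder condition, standing in for "if (perhaps)". */
  if (d->n == 0) croak("Could not frobnicate the data");
}

MODULE = Example  PACKAGE = Example

void
xs(UV n)
  CODE:
  {
    Data* object_graph;
    Newx(object_graph, 1, Data);
    Data_init(object_graph, n);

    /* Register cleanup as soon as the resource is fully initialized. */
    SAVEDESTRUCTOR_X(cleanup_data, object_graph);

    /* May croak(); the registered destructor still runs. */
    some_other_function(aTHX_ object_graph);

    /* No explicit Data_destroy()/Safefree() here: the destructor
     * runs when the XSUB's enclosing pseudo-block is left. */
  }
```

With the destructor registered, the explicit cleanup at the end of the XSUB has to go, otherwise the object would be freed twice on a normal return. Anything that can croak() between Newx() and SAVEDESTRUCTOR_X() would still leak, so that window should be kept as small as possible.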

Excerpt from Localizing changes in perlguts, which discusses the equivalent to { local $foo; ... } blocks:

There is a way to achieve a similar task from C via Perl API: create a pseudo-block, and arrange for some changes to be automatically undone at the end of it, either explicit, or via a non-local exit (via die()). A block-like construct is created by a pair of ENTER/LEAVE macros (see Returning a Scalar in perlcall). Such a construct may be created specially for some important localized task, or an existing one (like boundaries of enclosing Perl subroutine/block, or an existing pair for freeing TMPs) may be used. (In the second case the overhead of additional localization must be almost negligible.) Note that any XSUB is automatically enclosed in an ENTER/LEAVE pair.

Inside such a pseudo-block the following service is available:

  • […]

  • SAVEDESTRUCTOR(DESTRUCTORFUNC_NOCONTEXT_t f, void *p)

    At the end of pseudo-block the function f is called with the only argument p.

  • SAVEDESTRUCTOR_X(DESTRUCTORFUNC_t f, void *p)

    At the end of pseudo-block the function f is called with the implicit context argument (if any), and p.

The section also lists a couple of specialized destructors, like SAVEFREESV(SV *sv) and SAVEMORTALIZESV(SV *sv) that may be more correct than a premature sv_2mortal() in some cases.
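To picture the pseudo-block behaviour the excerpt describes, here is a toy model in plain C. It is not Perl's implementation and uses none of Perl's API. All names (toy_enter, toy_leave, toy_save_destructor, toy_croak, toy_eval, work) are made up for illustration. The model only shows the idea: destructors recorded on a stack get run when the block ends, whether by a normal exit or by a longjmp-style non-local exit.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Toy save stack: pairs of (function, pointer) to call on scope exit. */
typedef void (*toy_destructor_fn)(void *);
typedef struct { toy_destructor_fn f; void *p; } ToySaveEntry;

#define TOY_SAVE_MAX 64
static ToySaveEntry save_stack[TOY_SAVE_MAX];
static int save_top = 0;
static jmp_buf *current_handler = NULL;

/* Model of ENTER: remember where the save stack currently ends. */
static int toy_enter(void) { return save_top; }

/* Model of LEAVE: run everything registered since `mark`, newest first. */
static void toy_leave(int mark) {
  while (save_top > mark) {
    ToySaveEntry e = save_stack[--save_top];
    e.f(e.p);
  }
}

/* Model of SAVEDESTRUCTOR: schedule f(p) for the end of the block. */
static void toy_save_destructor(toy_destructor_fn f, void *p) {
  assert(save_top < TOY_SAVE_MAX);
  save_stack[save_top].f = f;
  save_stack[save_top].p = p;
  save_top++;
}

/* Model of croak(): non-local jump to the nearest handler. */
static void toy_croak(void) {
  assert(current_handler != NULL);
  longjmp(*current_handler, 1);
}

/* Model of eval { ... }: returns 1 if body croaked, else 0.
 * Either way, the save stack is unwound back to the mark. */
int toy_eval(void (*body)(void *), void *arg) {
  jmp_buf handler;
  jmp_buf *outer = current_handler;
  int mark = toy_enter();
  int failed;
  current_handler = &handler;
  if (setjmp(handler) == 0) {
    body(arg);
    failed = 0;
  } else {
    failed = 1;
  }
  current_handler = outer;
  toy_leave(mark);
  return failed;
}

int live_buffers = 0;  /* counts buffers not yet freed */

static void free_buffer(void *p) { free(p); live_buffers--; }

/* Stand-in for an XSUB body; *arg nonzero means "croak midway". */
void work(void *arg) {
  int mark = toy_enter();
  void *buf = malloc(16);
  assert(buf != NULL);
  live_buffers++;
  toy_save_destructor(free_buffer, buf);
  if (*(int *) arg) toy_croak();  /* skips the toy_leave() below */
  toy_leave(mark);                /* normal exit path */
}
```

Calling toy_eval(work, &flag) with flag set to 0 or 1 leaves live_buffers at 0 in both cases. On the croaking path the cleanup is done by the outer toy_leave() in toy_eval(), which is the role the enclosing ENTER/LEAVE pair plays in the excerpt.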

These macros have been available for a very long time: since at least Perl 5.6, and possibly earlier.

amon