I prepared a C++ interface to a legacy Fortran library.

Some subroutines in the legacy library follow an ugly but usable status code convention to report errors, and I use such status codes to throw a readable exception from my C++ code: it works great.

On the other hand, sometimes the legacy library calls STOP (which terminates the program). And it often does it even though the condition is recoverable.

I would like to capture this STOP from within C++, and so far I have been unsuccessful.

The following code is simple, but exactly represents the problem at hand:

The Fortran legacy library fmodule.f90:

module fmodule
  use iso_c_binding
  contains
    subroutine fsub(x) bind(c, name="fsub")
      real(c_double) x
      if(x>=5) then 
         stop 'x >=5 : this kills the program'
      else
         print*, x
      end if
    end subroutine fsub    
end module fmodule

The C++ Interface main.cpp:

#include<iostream>

// prototype for the external Fortran subroutine
extern "C" {
  void fsub(double& x);  
}

int main() {  
  double x;
  while(std::cin >> x) {
    fsub(x);
  }
  return 0;
}

The compilation lines (GCC 4.8.1 / OS X 10.7.4; $ denotes the command prompt):

$ gfortran -o libfmodule.so fmodule.f90 -shared  -fPIC -Wall
$ g++ main.cpp -L. -lfmodule -std=c++11

The run:

$ ./a.out 
1
   1.0000000000000000     
2
   2.0000000000000000     
3
   3.0000000000000000     
4
   4.0000000000000000     
5
STOP x >=5 : this kills the program

How could I capture the STOP and, say, request another number? Note that I do not want to touch the Fortran code.

What I have tried:

  • std::atexit: cannot "come back" from it once I have entered it
  • std::signal: STOP does not seem to throw a signal which I can capture
Escualo
  • I'm guessing this is hard -- STOP is meant to terminate the process. It'd be no different than C/C++ library that called `exit`. In this case, you might be able to hook the STOP call ahead of the FORTRAN runtime (if that's possible at all), but inherently the code you're working against is not written to behave as a library should. Fixing the FORTRAN library is likely a much less painful and easier-to-verify route. – Joe Oct 25 '13 at 17:58
  • It is not possible, you have to adjust the Fortran code, or hijack the run-time library calls the particular compiler uses for STOP. – Vladimir F Героям слава Oct 25 '13 at 18:00
  • And I see it is actually Fortran 90+. By ugly legacy FORTRAN code people usually mean something different. – Vladimir F Героям слава Oct 25 '13 at 18:02
  • I really, really, really don't want to touch the Fortran code. But it seems almost inevitable... – Escualo Oct 25 '13 at 18:22
  • It is not entirely true that one cannot "come back" from an `atexit` handler - see my answer. – Hristo Iliev Oct 26 '13 at 14:54

4 Answers

You can solve your problem by intercepting the call to the exit function from the Fortran runtime, as shown below. Here a.out is built from your code with the compilation lines you give.

Step 1. Figure out which function is called. Fire up gdb:

$ gdb ./a.out
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
[...]
(gdb) break fsub
Breakpoint 1 at 0x400888
(gdb) run
Starting program: a.out 
5

Breakpoint 1, 0x00007ffff7dfc7e4 in fsub () from ./libfmodule.so
(gdb) step
Single stepping until exit from function fsub,
which has no line number information.
stop_string (string=0x7ffff7dfc8d8 "x >=5 : this kills the programfmodule.f90", len=30) at /usr/local/src/gcc-4.7.2/libgfortran/runtime/stop.c:67

So stop_string is called. We need to know to which symbol this function corresponds.

Step 2. Find the exact name of the stop_string function. It must be in one of the shared libraries.

$ ldd ./a.out 
    linux-vdso.so.1 =>  (0x00007fff54095000)
    libfmodule.so => ./libfmodule.so (0x00007fa31ab7d000)
    libstdc++.so.6 => /usr/local/gcc/4.7.2/lib64/libstdc++.so.6 (0x00007fa31a875000)
    libm.so.6 => /lib64/libm.so.6 (0x0000003da4000000)
    libgcc_s.so.1 => /usr/local/gcc/4.7.2/lib64/libgcc_s.so.1 (0x00007fa31a643000)
    libc.so.6 => /lib64/libc.so.6 (0x0000003da3c00000)
    libgfortran.so.3 => /usr/local/gcc/4.7.2/lib64/libgfortran.so.3 (0x00007fa31a32f000)
    libquadmath.so.0 => /usr/local/gcc/4.7.2/lib64/libquadmath.so.0 (0x00007fa31a0fa000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003da3800000)

I found it in (no surprise) the Fortran runtime.

$ readelf -s /usr/local/gcc/4.7.2/lib64/libgfortran.so.3|grep stop_string
  1121: 000000000001b320    63 FUNC    GLOBAL DEFAULT   11 _gfortran_stop_string@@GFORTRAN_1.0
  2417: 000000000001b320    63 FUNC    GLOBAL DEFAULT   11 _gfortran_stop_string

Step 3. Write a function that will replace that function.

I looked up the precise signature of the function in the source code (/usr/local/src/gcc-4.7.2/libgfortran/runtime/stop.c, see the gdb session).

$ cat my_exit.c 
#define _GNU_SOURCE
#include <stdio.h>

void _gfortran_stop_string (const char *string, int len)
{
        printf("Let's keep on");
}

Step 4. Compile a shared object exporting that symbol.

gcc -Wall -fPIC -c -o my_exit.o my_exit.c
gcc -shared -fPIC -Wl,-soname -Wl,libmy_exit.so -o libmy_exit.so my_exit.o

Step 5. Run the program with LD_PRELOAD so that our new function takes precedence over the one from the runtime.

$ LD_PRELOAD=./libmy_exit.so ./a.out 
1
   1.0000000000000000     
2
   2.0000000000000000     
3
   3.0000000000000000     
4
   4.0000000000000000     
5
Let's keep on   5.0000000000000000     
6
Let's keep on   6.0000000000000000     
7
Let's keep on   7.0000000000000000   

There you go.
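
As a variation on step 3, here is a minimal sketch in which the replacement records the message instead of only printing, so that the C++ side can check after each call whether a STOP was intercepted. The gdb session above shows that the string is not NUL-terminated (hence the separate len argument), so the copy has to use len. The names g_stop_seen, g_stop_message and stop_intercepted_during are hypothetical, and the signature is the one from the GCC 4.7 sources used above; other versions of the runtime may differ, so check stop.c for yours. The caveat that the Fortran routine keeps executing after the intercepted STOP still applies.

```cpp
#include <string>

// Hypothetical bookkeeping, not part of libgfortran.
static bool g_stop_seen = false;
static std::string g_stop_message;

// Same symbol and signature as in my_exit.c (taken from the GCC 4.7 sources).
// extern "C" keeps the name unmangled so it can stand in for the runtime's.
extern "C" void _gfortran_stop_string(const char* string, int len) {
  g_stop_seen = true;
  // The message is not NUL-terminated: copy exactly len characters.
  g_stop_message.assign(
      string, static_cast<std::string::size_type>(len > 0 ? len : 0));
}

// Hypothetical helper: clears the record, runs the call, and reports
// whether a STOP was intercepted while it ran.
template <typename Call>
bool stop_intercepted_during(Call call) {
  g_stop_seen = false;
  g_stop_message.clear();
  call();
  return g_stop_seen;
}
```

A caller could then write something like `if (stop_intercepted_during([&] { fsub(x); })) { /* report g_stop_message */ }`, with the replacement built into the preloaded shared object using g++ instead of gcc.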

damienfrancois
  • It was a fun challenge :) – damienfrancois Oct 25 '13 at 21:49
  • Wouldn't it be easier to come up with a cpp macro redefining the STOPs to calls of your function? – Vladimir F Героям слава Oct 25 '13 at 22:05
  • Nice. It should be noted: the FORTRAN code may or may not leave things in a good state where it normally would have `STOP`ped. @Arrieta: you will want to check that the routines you're calling don't assume prior state that may be now incomplete... – Joe Oct 25 '13 at 22:20
  • wow..! but really this strikes me as asking for a code maintenance / portability nightmare. – agentp Oct 26 '13 at 13:12
  • Steps 1 and 2 together take 5 seconds if you compile the Fortran code with `gfortran -c -fdump-tree-all fmodule.f90` and then take a look at `fmodule.f90.003t.original`: `_gfortran_stop_string (&"x >=5 : this kills the program"[1]{lb: 1 sz: 1}, 30);` More people should know about GCC syntax tree dumps; they are extraordinarily useful in many cases. – Hristo Iliev Oct 26 '13 at 13:48
  • Note that what you do prevents `STOP` from stopping the subroutine, so it will continue to execute. This could result in all kinds of nasty things being done by the code that follows: out-of-bounds array access, numeric under-/overflows, etc. – Hristo Iliev Oct 26 '13 at 15:05
  • @VladimirF I totally disregarded solutions involving the compiler ; my solution starts with the compiled binary, but your trick is great. – damienfrancois Oct 26 '13 at 19:31
  • @HristoIliev Good to know. I'll remember this. – damienfrancois Oct 26 '13 at 19:31

Since what you want would result in non-portable code anyway, why not just subvert the exit mechanism using the obscure long jump mechanism:

#include<iostream>
#include<csetjmp>
#include<cstdlib>

// prototype for the external Fortran subroutine
extern "C" {
  void fsub(double* x);  
}

volatile bool please_dont_exit = false;
std::jmp_buf jenv;

static void my_exit_handler() {
  if (please_dont_exit) {
    std::cout << "But not yet!\n";
    // Re-register ourself
    std::atexit(my_exit_handler);
    std::longjmp(jenv, 1);
  }
}

void wrapped_fsub(double& x) {
  please_dont_exit = true;
  if (!setjmp(jenv)) {
    fsub(&x);
  }
  please_dont_exit = false;
}

int main() {
  std::atexit(my_exit_handler);  
  double x;
  while(std::cin >> x) {
    wrapped_fsub(x);
  }
  return 0;
}

Calling longjmp transfers control back to the setjmp call in wrapped_fsub, which then returns a second time, now with the value passed as the second argument of longjmp (1 here). On the initial, direct call setjmp returns 0, so fsub is only called on that first pass. Sample output (OS X 10.7.4, GCC 4.7.1):

$ ./a.out 
2
   2.0000000000000000     
6
STOP x >=5 : this kills the program
But not yet!
7
STOP x >=5 : this kills the program
But not yet!
4
   4.0000000000000000
^D     
$
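
The setjmp/longjmp control flow can be seen in isolation in the following minimal sketch, which involves no Fortran and no exit handler. The names env, bail_out and run_guarded are hypothetical; bail_out merely stands in for a routine that never returns normally.

```cpp
#include <csetjmp>

static std::jmp_buf env;

// Hypothetical stand-in for a routine that does not return normally.
static void bail_out() {
  std::longjmp(env, 1);  // makes the pending setjmp return 1
}

// Returns true if the guarded code ran to completion, false if control
// came back through longjmp.
bool run_guarded(bool bail) {
  if (setjmp(env) == 0) {
    // First pass: setjmp returned 0 from the direct call.
    if (bail) bail_out();
    return true;
  }
  // Second pass: control arrives here after longjmp; setjmp returned 1.
  return false;
}
```

Here run_guarded(false) completes normally and run_guarded(true) comes back through the longjmp path. As in the answer's code, no destructors run for anything skipped by the jump, so the guarded region should not own C++ objects that need cleanup.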

No library preloading required (which anyway is a bit more involved on OS X than on Linux). A word of warning though - exit handlers are called in reverse order of their registration. One should be careful that no other exit handlers are registered after my_exit_handler.

Hristo Iliev
  • This is very clever. And portable. Let me play with it. – Escualo Oct 26 '13 at 15:37
  • @Arrieta, mind that this way the stack of `fsub` is not unwound properly (as `STOP` is supposed to terminate the whole program), which means e.g. that local allocatable arrays are not deallocated. You might end up with memory leaks. – Hristo Iliev Oct 26 '13 at 16:01

Combining the two answers that use a custom _gfortran_stop_string function and longjmp, I thought that raising an exception inside the custom function and then catching it in the main code would work similarly. So this came out:

main.cpp:

#include<iostream>

// prototype for the external Fortran subroutine
extern "C" {
  void fsub(double& x);  
}

int main() {  
  double x;
  while(std::cin >> x) {
    try { fsub(x); }
    catch (int rc) { std::cout << "Fortran stopped with rc = " << rc <<std::endl; }
  }
  return 0;
}

catch.cpp:

extern "C" {
    void _gfortran_stop_string (const char*, int);
}

void _gfortran_stop_string (const char *string, int len)
{
        throw 666;
}

Then, compiling:

gfortran -c fmodule.f90
g++ -c catch.cpp
g++ main.cpp fmodule.o catch.o -lgfortran

Running:

./a.out
2
   2.0000000000000000     
3
   3.0000000000000000     
5
Fortran stopped with rc = 666
6
Fortran stopped with rc = 666
2
   2.0000000000000000     
3
   3.0000000000000000     
^D

So, seems to work :)
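
A possible refinement, as a sketch only: throw an exception type that carries the STOP message instead of a bare int. The type name fortran_stop is hypothetical. The message is built from the pointer and the length because the runtime passes a string that is not NUL-terminated. Whether an exception can propagate through the Fortran frames depends on the compiler and platform; the run above suggests it does in this setup, but that is not guaranteed elsewhere.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type carrying the STOP message.
struct fortran_stop : std::runtime_error {
  explicit fortran_stop(const std::string& msg) : std::runtime_error(msg) {}
};

// Same symbol and signature as in catch.cpp above.
extern "C" void _gfortran_stop_string(const char* string, int len) {
  // Not NUL-terminated: build the message from (pointer, length).
  throw fortran_stop(
      std::string(string, static_cast<std::string::size_type>(len > 0 ? len : 0)));
}
```

main.cpp would then use `catch (const fortran_stop& e)` and print `e.what()`, with the struct declared in a header shared by both files.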

steabert

I suggest you fork your process before calling the Fortran code and exit 0 (edit: if STOP exits with zero, you will need a sentinel exit code; clunky, but it does the job) after the Fortran execution. That way every Fortran call finishes in the same way: as if it had stopped. Or, if "STOP" indicates an error, throw the exception when the Fortran code stops and send some other message when the Fortran execution "completes" normally.

Below is an example inspired by your code, assuming a Fortran "STOP" is an error.

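 #include <cstdlib>
 #include <iostream>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <unistd.h>

 // prototype for the external Fortran subroutine
 extern "C" {
   void fsub(double& x);
 }

 int main() {
   double x;
   // some value that is different from all STOP exit code values
   // (42 is an arbitrary placeholder)
   int exit_code_normal = 42;
   while(std::cin >> x) {
     pid_t pid = fork();
     if(pid < 0) {
       // error with the fork: handle appropriately
       std::cerr << "fork failed" << std::endl;
       return 1;
     } else if(pid == 0) {
       // child: runs the Fortran code, which may STOP
       fsub(x);
       std::exit(exit_code_normal);
     } else {
       // parent: wait for the child and decode its exit status
       int status = 0;
       waitpid(pid, &status, 0);
       if(!WIFEXITED(status) || WEXITSTATUS(status) != exit_code_normal)
         // throw your error here; printing is just a placeholder
         std::cerr << "Fortran STOP for x = " << x << std::endl;
     }
   }
   return 0;
 }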

The exit code could be a constant instead of a variable. I don't think it matters much.

Following a comment, it occurs to me that the result of the execution would be lost if it lives only in the memory of the child process (rather than, say, being written to a file). If that is the case, I can think of 3 possibilities:

  • The Fortran code messes with a whole lot of memory during the call, and letting the execution continue beyond the STOP is probably not a good idea in the first place.
  • The Fortran code simply returns some value (through its argument, if my Fortran is not too rusty), and this could be relayed back to the parent easily through a shared memory space.
  • The execution of the Fortran subroutine acts on an external system (e.g. writes to a file) and no return values are expected.

In the 3rd case, my solution above works as is. I prefer it over the other suggested solutions mainly because: 1) you don't have to ensure the build process is properly maintained, 2) the Fortran "STOP" still behaves as expected, and 3) it requires very few lines of code and all the "Fortran STOP workaround" logic sits in one single place. So in terms of long-term maintenance, I much prefer that.

In the 2nd case, my code above needs a small modification but still keeps the advantages enumerated above at the price of minimal added complexity.

In the 1st case, you will have to mess with the Fortran code no matter what.
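
For the 2nd case, a minimal sketch of relaying the argument back to the parent through an anonymous shared mapping. It assumes Linux (MAP_ANONYMOUS), a single double as the only result, and an arbitrary sentinel exit code of 42. The helper name run_isolated is hypothetical.

```cpp
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs work(x) in a forked child and relays the
// (possibly modified) argument back through shared memory.
// Returns true if the child reached the end of work, false if it exited
// early (as a Fortran STOP would make it do) or if mmap/fork failed.
bool run_isolated(void (*work)(double&), double& x) {
  const int exit_code_normal = 42;  // arbitrary sentinel (assumption)
  void* mem = mmap(nullptr, sizeof(double), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED) return false;
  double* shared = static_cast<double*>(mem);
  *shared = x;

  bool ok = false;
  pid_t pid = fork();
  if (pid == 0) {
    work(*shared);            // may never return
    // _exit skips atexit handlers and stdio flushing; use std::exit
    // instead if buffered output from the child must be flushed.
    _exit(exit_code_normal);  // reached only on normal completion
  } else if (pid > 0) {
    int status = 0;
    if (waitpid(pid, &status, 0) == pid)
      ok = WIFEXITED(status) && WEXITSTATUS(status) == exit_code_normal;
    if (ok) x = *shared;      // copy the result back only on success
  }
  munmap(mem, sizeof(double));
  return ok;
}
```

With the prototype from the question, the call would look like `run_isolated(fsub, x)`. Anything else the Fortran code changes in its own memory is still lost with the child.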

Sebastien
  • This is clever, but what if `fsub` modifies a global state or returns a value? The parent process won't see the changes made by the child because of the copy-on-write memory mapping. – Hristo Iliev Oct 26 '13 at 16:08
  • Excellent comment. That could become hellishly complicated if the fortran code is non-reentrant or if the return values are hard to fetch from the parent. – Sebastien Oct 26 '13 at 18:07