
I'm not sure whether this question belongs here, but my code fails with PETSc reporting a floating point error. The problem is similar to the one discussed in the links below:

http://lists.mcs.anl.gov/pipermail/petsc-users/2012-November/015858.html
https://www.mail-archive.com/petsc-users@mcs.anl.gov/msg22930.html

Some people in those threads just said to use "fp_trap". But where am I supposed to enter that? I tried starting gdb and valgrind and then entering "fp_trap", but that does not work.

user4352158

1 Answer


Here is a piece of code that needs some debugging. Run without any options, the program prints "norm is -nan".

static char help[] = "Floating point exception.\n\n";

#include <petscvec.h>

#undef __FUNCT__
#define __FUNCT__ "ShouldTriggerFloatingPointException"
extern PetscErrorCode ShouldTriggerFloatingPointException(){

    PetscErrorCode     ierr;

    PetscFunctionBegin;

    PetscScalar a=0.0;
    PetscScalar b=0.0;
    PetscScalar c=a/b;
    Vec x;

    ierr=VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
    ierr= VecSetSizes(x,PETSC_DECIDE,1000);CHKERRQ(ierr);
    ierr=  VecSetFromOptions(x);CHKERRQ(ierr);

    ierr=VecScale(x,c);CHKERRQ(ierr);
    PetscReal norm; /* VecNorm returns a PetscReal, even with complex scalars */
    ierr=VecNorm(x,NORM_2,&norm);CHKERRQ(ierr);

    ierr=PetscPrintf(PETSC_COMM_WORLD,"norm is %g\n",(double)norm);CHKERRQ(ierr);

    PetscFunctionReturn(0); 
}


#undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **argv)
{

    PetscErrorCode ierr;
    ierr=PetscInitialize(&argc,&argv,(char*)0,help);if (ierr) return ierr;
    ierr=ShouldTriggerFloatingPointException();CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
}
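
The silent NaN can be reproduced without PETSc. The sketch below is plain C and only an illustration: `divide` and `scaled_sum_of_squares` are hypothetical helpers standing in for the division `a/b` and for `VecScale` followed by a squared norm. It assumes the default floating point environment, in which an invalid operation such as 0.0/0.0 quietly returns NaN instead of stopping the program.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: performs the division at run time. The volatile
   copies keep the compiler from folding 0.0/0.0 at compile time. */
static double divide(double a, double b)
{
    volatile double va = a, vb = b;
    return va / vb;
}

/* Hypothetical stand-in for VecScale followed by a squared 2-norm:
   scales every entry of x by factor, then returns the sum of squares. */
static double scaled_sum_of_squares(double *x, size_t n, double factor)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        x[i] *= factor;        /* NaN times anything is NaN, no error is raised */
        sum += x[i] * x[i];    /* the NaN propagates into the accumulated sum */
    }
    return sum;
}
```

Calling `scaled_sum_of_squares(x, n, divide(0.0, 0.0))` returns NaN for any non-empty array without any error being reported, which is why the first visible symptom appears far from the offending division.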

The following makefile builds the executable main (see https://scicomp.stackexchange.com/questions/905/compiling-and-running-a-hello-world-program-in-petsc):

include ${PETSC_DIR}/conf/variables
include ${PETSC_DIR}/conf/rules

main: main.o  chkopts
    -${CLINKER} -o main main.o  ${PETSC_LIB}
    ${RM} main.o

The PETSc option -fp_trap helps to locate the origin of this issue:

mpirun -np 2 main -fp_trap

prints an error message with the name of the function where the problem was detected:

[0]PETSC ERROR: [0] ShouldTriggerFloatingPointException line 14 main.c
[0]PETSC ERROR: ------- Error Message----------
[0]PETSC ERROR: Floating point exception!
[0]PETSC ERROR: trapped floating point error!

Hence -fp_trap is a command-line option of the program built with PETSc. It is not a command of the debugger.
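
The reason is that main hands argc and argv to PetscInitialize, which reads options such as -fp_trap from them. The sketch below only illustrates that idea in plain C: `has_option` is a hypothetical helper, not a PETSc function, and PETSc's real option handling is more elaborate (it can also read options from other sources).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of PETSc: returns 1 if name appears among
   the program's command-line arguments. argv[0] is the program name, so
   the scan starts at index 1. */
static int has_option(int argc, char **argv, const char *name)
{
    for (int i = 1; i < argc; ++i) {
        if (strcmp(argv[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}
```

An option typed at the debugger prompt never reaches argv, so the program cannot see it. It has to follow the executable name on the command line.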

A debugger is needed to find the exact line where the issue occurred. For instance, with gdb (and PETSc configured with --with-debugging):

gdb --args  main -fp_trap
(gdb) run

The --args option tells gdb that the arguments after the executable file are passed to the executable. This produces:

Program received signal SIGFPE, Arithmetic exception.
0x00000000004011a2 in ShouldTriggerFloatingPointException () at main.c:18
18  PetscScalar c=a/b;

The offending line is now identified: the division a/b, where both a and b are 0.0.

For further information about debugging MPI programs, see: How do I debug an MPI program?

francis
  • Trying `gdb --args main -fp_trap`, I got `Unknown option: -fp_trap Program exited normally.` – user4352158 Jan 27 '15 at 22:05
  • What are the outputs of `main -fp_trap` and `gdb main`? The code I posted works with PETSc 3.4.4. I have not tested it with PETSc 3.5. Which version of PETSc are you using? – francis Jan 27 '15 at 22:11
  • I have just tested it with petsc-3.5.2 and it seems to work fine. Have you compiled PETSc with debugging? This can be checked by running `./main -log_summary`, which should print something like `Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 ${COPTFLAGS} ${CFLAGS}`. The `-g3` should mean that debugging is on. – francis Jan 27 '15 at 22:28
  • I'm not sure where I'm supposed to enter `log_summary`. I am using scientific linux. In the directory `user/project/Build`, after I ran 'make' to compile and link all the cpp files,I had no problems. But then, when I went to directory `user/run/run.sh`, which runs the project binary in `user/project/Build/bin/project`, I get the floating point exception error. In the directory `user/run`, I enter `gdb` in the command prompt and get the message ` *** No targets specified and no makefile found. Stop` – user4352158 Jan 28 '15 at 23:38
  • It seems that your question is related to `gdb`. As explained in the question http://stackoverflow.com/questions/27992719/using-gdb-to-detect-segmentation-fault-in-sh , `gdb` produces a result if you run it on the executable, not on the `.sh` script. Could you try something like `gdb --args ../project/Build/bin/project/name-of-executable -fp_trap -log_summary` in folder `user/run`? The options are for the executable, that is, for the program built with PETSc, not for `gdb`. This is why the `--args` option of `gdb` is used. Type `gdb --help` to learn more about `gdb`. – francis Jan 29 '15 at 15:52
  • If a prompt like `(gdb)` appears, type `run` and press enter. Oh ! I have just seen that http://stackoverflow.com/questions/27992719/using-gdb-to-detect-segmentation-fault-in-sh was one of your questions ! – francis Jan 29 '15 at 15:56