
I am running a C program that is parallelized with MPI and built with the Intel C compiler. At the end of a long calculation it prints out a large table of data, but it crashes while printing a line of data, sometimes in the middle of a number. I was hoping someone here would be able to identify the issue. This is an example of a row-printing loop that sometimes triggers the crash:

out=fopen(output,"a");
int i, N=1200;
for(i=0;i<N;i++){
  fprintf(out," spin-%-5d ",i+1);
}

And this is how the data is printed immediately after that:

int t, points = 100;
for(t=0;t<points;t++){
    fprintf(out,"\n");
    for(i=0;i<N;i++){
        fprintf(out,"%9lf   ",array[i][t]);
    }
}

If it helps, this is the error block I get:

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 5243 RUNNING AT condofree037
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
   Intel(R) MPI Library troubleshooting guide:
      https://software.intel.com/node/561764
===================================================================================

Edit: As some suggested, I tried making a minimal reproducible program and couldn't get it to fail. I then inspected my code and saw that some fprintf calls were outside of the if(node_number==0) block, so the other nodes were also writing to the output file, which caused the fault. Thanks.
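
For anyone who runs into the same thing, here is a minimal sketch of the guard described in the edit above. The helper name print_row_if_root and the flat row buffer are assumptions made for illustration; they are not taken from my actual program, which indexes array[i][t]. In a real MPI program node_number would typically come from MPI_Comm_rank(MPI_COMM_WORLD, &node_number); it is passed in as a plain int here so the snippet compiles without MPI.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original program): writes one row of
 * the table, but only on the process whose rank is 0. Every other rank
 * returns immediately without touching the file.
 * Returns the number of values written, 0 on non-root ranks,
 * or -1 if the stream is NULL or a write fails. */
int print_row_if_root(FILE *out, int node_number, const double *row, int n)
{
    int i;

    if (node_number != 0) {
        return 0;               /* non-root ranks must not write */
    }
    if (out == NULL) {
        return -1;              /* fopen failed earlier, nothing to write to */
    }
    if (fprintf(out, "\n") < 0) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        if (fprintf(out, "%9f   ", row[i]) < 0) {
            return -1;
        }
    }
    return n;
}
```

The point of putting the rank check inside the helper is that no fprintf call can end up outside the guard by accident, which is what happened in my code.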

FPerras
  • How do you check if `fopen` succeeded? If it fails it will return NULL and voila... Do you close `out`? If not you will run out of file handles and some `fopen` will return NULL. – Grzegorz May 29 '20 at 21:24
  • I don't think `%lf` is a standard format specifier. – Nate Eldredge May 29 '20 at 21:24
  • Anyhow, you should create a [mcve]. The bug may be triggered by code you haven't shown us. – Nate Eldredge May 29 '20 at 21:25
  • Some other part of the program could have had undefined behavior. This often causes errors later in unrelated code. – Barmar May 29 '20 at 21:27
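
The check suggested in the first comment could look roughly like the sketch below. The helper name append_text is hypothetical and only serves as an illustration; the idea is simply that the result of `fopen` is tested before any write, and that the file is closed again so handles are not leaked.

```c
#include <stdio.h>

/* Hypothetical helper: appends `text` to the file at `path`.
 * Checks fopen, the write, and fclose.
 * Returns 0 on success and -1 on any failure. */
int append_text(const char *path, const char *text)
{
    int status = 0;
    FILE *out = fopen(path, "a");

    if (out == NULL) {
        perror(path);           /* report why the file could not be opened */
        return -1;
    }
    if (fputs(text, out) == EOF) {
        status = -1;
    }
    if (fclose(out) != 0) {     /* fclose flushes, so it can fail too */
        status = -1;
    }
    return status;
}
```

Calling this with a path in a directory that does not exist returns -1 instead of passing a NULL stream to a later `fprintf`.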
