I am very new to OpenMP. I am summing N integer values stored in an array and compiling the code with gfortran. Up to N=10^6, the serial and parallel codes give exactly the same result. For N=10^7, the serial code still runs, but the parallel code (compiled with the -fopenmp flag) fails with "Segmentation fault". My code is given below. Could anyone please explain why this happens?

  use omp_lib
  REAL*8 r,summ,sl
  parameter (N=10000000)
  dimension r(N)

  do i=1,N
  r(i)=i
  enddo

  summ=0.0d00
  sl=0.0d00

  !$OMP PARALLEL FIRSTPRIVATE(sl) SHARED(r,summ)
  !$OMP DO 
  do i=1,N
  sl=sl+r(i)
  enddo
  !$OMP END DO
  !$OMP CRITICAL
  summ=summ+sl
  !$OMP END CRITICAL
  !$OMP END PARALLEL

  write(*,*)'SUM',summ

  end
  • Welcome, please take the [tour] and use tag [tag:fortran] for Fortran questions. – Vladimir F Героям слава Apr 05 '19 at 11:15
  • Possible duplicate of [Get the maximum value among OpenMP threads in Fortran](https://stackoverflow.com/questions/54735342/get-the-maximum-value-among-openmp-threads-in-fortran) – Vladimir F Героям слава Apr 05 '19 at 11:16
  • Please see the link, the error is the same, you must use `shared`, not `firstprivate`. Actually, you should remove `sl` altogether and just do the reduction with `summ`. – Vladimir F Героям слава Apr 05 '19 at 11:17
  • But the actual reason for the segfault is explained here: https://stackoverflow.com/questions/13264274/why-segmentation-fault-is-happening-in-this-openmp-code I suggest using an allocatable array, but you can also use an unlimited stack instead. – Vladimir F Героям слава Apr 05 '19 at 11:21
  • I don't see any issue with `firstprivate` and `shared`. Accumulation is done through a thread-local variable initialized to zero, then each thread adds its private accumulator to the (shared) global accumulator inside a critical section. This is different from the other question (where the global accumulator was not shared). – Brice Apr 05 '19 at 12:59
  • I also suggest you always use Implicit None and forget you ever knew about Real*8 and learn the proper way to do it – Ian Bush Apr 05 '19 at 13:34
  • Using implicit none and also setting the larger stack size does not solve the problem. – Sumanta Kundu Apr 05 '19 at 14:56
  • I have also tried the reduction operator and faced the same problem when the numbers are stored in the array. However, if I directly use sl=sl+i instead of r(i), I get the correct result, whether I use the reduction or the code as written above. Here r(i)=i, but r is declared as SHARED and I need it for the case r(i)=f(i), where f is some complicated function. I do not understand the source of the error. – Sumanta Kundu Apr 05 '19 at 15:07
  • I have been able to reproduce the error, and in my case turning `r` into an **allocatable** array, as suggested in the second (not the accepted) answer to the question linked to by @VladimirF, worked. – chw21 Apr 06 '19 at 09:08
  • @Brice Doesn't really matter, the real duplicate is https://stackoverflow.com/questions/13264274/why-segmentation-fault-is-happening-in-this-openmp-code The first link just shows how to do reductions properly. – Vladimir F Героям слава Apr 06 '19 at 15:00
  • You're using firstprivate correctly, but the reduction operator is semantically clearer and likely more efficient. – Richard Aug 08 '19 at 14:51

1 Answer

I have run into the same problem before. The issue is that your code needs a large amount of stack memory.

Make sure you compile your code with the option `-mcmodel=medium`. Also, with gfortran `-fopenmp` implies `-frecursive`, which forces all local arrays onto the stack. Your array `r` takes 10^7 × 8 bytes = 80 MB, far more than the default stack limit (typically 8 MB on Linux), so the program accesses memory beyond the stack limit, which leads to the segmentation fault. To get rid of this problem you have to lift the default stack limit. A quick way to do this is to run `ulimit -s unlimited` in a terminal and then launch your program from the same terminal. You can also try the compile option `-fmax-stack-var-size=n`, where n is a size in bytes above which local arrays are kept off the stack. Note, however, that the gfortran documentation says `-frecursive` cannot be used together with this option, so it may have no effect in combination with `-fopenmp`.
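The comments under the question suggest an allocatable array as an alternative to raising the stack limit. Below is a minimal sketch of that variant, assuming gfortran with `-fopenmp`. It keeps the `firstprivate`/`critical` structure of the question's code and only moves `r` to the heap; the names `sum_allocatable`, `dp`, `expected` and the final check are illustrative additions, not part of the original post.

```fortran
program sum_allocatable
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! double precision kind (replaces REAL*8)
  integer, parameter :: n = 10000000
  real(dp), allocatable :: r(:)            ! allocatable: lives on the heap, not the stack
  real(dp) :: summ, sl, expected
  integer :: i

  allocate(r(n))
  do i = 1, n
     r(i) = real(i, dp)
  end do

  summ = 0.0_dp
  sl = 0.0_dp

  ! Each thread accumulates into its own copy of sl (initialised to 0 by
  ! firstprivate), then adds it to the shared total inside a critical section.
  !$omp parallel firstprivate(sl) shared(r, summ)
  !$omp do
  do i = 1, n
     sl = sl + r(i)
  end do
  !$omp end do
  !$omp critical
  summ = summ + sl
  !$omp end critical
  !$omp end parallel

  ! Closed form n(n+1)/2; every partial sum is an integer below 2**53,
  ! so the double precision result is exact regardless of thread order.
  expected = 0.5_dp * real(n, dp) * real(n + 1, dp)

  write(*,*) 'SUM', summ
  write(*,'(a,l1)') 'matches closed form: ', summ == expected
  deallocate(r)
end program sum_allocatable
```

Compiled with something like `gfortran -fopenmp sum_allocatable.f90`, this should run without changing `ulimit -s`, since the 80 MB array no longer sits on the stack.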

I also suggest computing the sum with a `reduction(+:summ)` clause instead of a critical region, which is inefficient and avoidable in this case.
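A minimal sketch of the reduction variant, assuming gfortran with `-fopenmp`. The array is made allocatable here so that the example does not depend on the stack limit; the names `sum_reduction`, `dp` and `expected` are illustrative and do not come from the question.

```fortran
program sum_reduction
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! double precision kind
  integer, parameter :: n = 10000000
  real(dp), allocatable :: r(:)            ! heap allocation avoids the stack limit
  real(dp) :: summ, expected
  integer :: i

  allocate(r(n))
  do i = 1, n
     r(i) = real(i, dp)
  end do

  summ = 0.0_dp

  ! reduction(+:summ) gives each thread a private summ initialised to 0
  ! and adds the private copies into the original at the end of the loop,
  ! so no separate local accumulator or critical region is needed.
  !$omp parallel do reduction(+:summ) shared(r)
  do i = 1, n
     summ = summ + r(i)
  end do
  !$omp end parallel do

  ! Closed form n(n+1)/2, exactly representable in double precision here.
  expected = 0.5_dp * real(n, dp) * real(n + 1, dp)

  write(*,*) 'SUM', summ
  write(*,'(a,l1)') 'matches closed form: ', summ == expected
  deallocate(r)
end program sum_reduction
```

For general floating-point data the result of a reduction can differ in the last digits from the serial sum, because the order of additions changes; in this example all values are integers small enough that the sum is exact.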

I hope that this helps you.

Noureddine