I have the loops below, which compute the array 'tab'. I tried using an OpenMP reduction on it, but it doesn't work: I get a segmentation fault whenever OMP_NUM_THREADS is greater than 1.

What am I doing wrong?

Regards.

!$OMP PARALLEL DO DEFAULT(SHARED) &
!$OMP PRIVATE(ii,ix_tab,iy_tab,ipx,ipy,iw,C) &
!$OMP REDUCTION(+:tab)
do ii = 1,N

    ix_tab = ...
    iy_tab = ...

    do ipy = -npy_max,npy_max
        do ipx = -npx_max,npx_max
            do iw = 1, M
                C = Fx(iw,ipx,ix_tab) * Fy(iw,ipy,iy_tab)

                tab(iw,ipx,ipy) = tab(iw,ipx,ipy) + A(iw,ii) * C
            enddo
        enddo
    enddo

enddo
!$OMP END PARALLEL DO

--- EDIT ---

Ok, here is my solution:

allocate(wrk(nt,-npx_max:npx_max,-npy_max:npy_max,nthreads))
wrk = 0   ! the per-thread partial sums must start from zero

!$OMP PARALLEL DEFAULT(SHARED) &
!$OMP PRIVATE(tid,ii,ix_tab,iy_tab,ipx,ipy,iw,C)

tid = OMP_GET_THREAD_NUM() + 1

!$OMP DO
do ii = 1,N

    ix_tab = ...
    iy_tab = ...

    do ipy = -npy_max,npy_max
        do ipx = -npx_max,npx_max
            do iw = 1, M
                C = Fx(iw,ipx,ix_tab) * Fy(iw,ipy,iy_tab)

                wrk(iw,ipx,ipy,tid) = wrk(iw,ipx,ipy,tid) + A(iw,ii) * C
            enddo
        enddo
    enddo

enddo
!$OMP END DO
!$OMP END PARALLEL

do  tid = 1, nthreads
    tab(:,:,:) = tab(:,:,:) + wrk(:,:,:,tid)
enddo

deallocate(wrk)

Can this be done better or faster?

Regards.

Fuji San
  • Your stack is probably overflowing... You should consider implementing the reduction yourself! – Alexander Vogt Apr 10 '15 at 09:43
  • Simply make `tab` an `ALLOCATABLE` array. Otherwise `REDUCTION` makes multiple private copies of it and many compilers tend to place those private copies on the stack even when they are bigger than the threshold for automatic heap allocation. – Hristo Iliev Apr 10 '15 at 10:46
  • @VladimirF: The original code in the other question contained data races that probably resulted in out-of-bound array access. If I read the timeline correctly, he applied the `reduction` clause before fixing the data race issue. In any case, the OpenMP standard states that `reduction` works as `private` when it comes to data allocation, i.e. it should create allocatable private copies of allocatable list items. – Hristo Iliev Apr 11 '15 at 07:42
  • Ok, I accept this. But as there is no race condition in that question now, you could have just answered it with a better answer; now they are exact duplicates IMHO. And I am pretty sure there are others in the history. – Vladimir F Героям слава Apr 11 '15 at 07:51
  • Related/possible duplicates: http://stackoverflow.com/questions/13558318/openmp-segmentation-fault-11 , http://stackoverflow.com/questions/13870564/gfortran-openmp-segmentation-fault-occurs-on-basic-do-loop – Vladimir F Героям слава Apr 11 '15 at 08:19
  • I considered carefully the similarity. Still, there is no indication that in the current state of the code the OP of the other question is still getting segmentation fault when the `reduction` clause is used. But the culprit in the second of the two questions you suggest is the same. – Hristo Iliev Apr 11 '15 at 22:27
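
The suggestion in the comments to make `tab` an `ALLOCATABLE` array can be sketched as follows. This is a minimal, self-contained illustration and not the asker's actual program: the sizes, the placeholder array `a` filled with ones, and the program name are assumptions chosen so that the result is easy to check, and the `Fx`/`Fy` factors are left out. Whether the private copies really end up on the heap depends on the compiler, so treat this as something to try rather than a guaranteed fix.

```fortran
! Sketch: REDUCTION on an ALLOCATABLE array (all sizes are hypothetical).
program reduction_allocatable
    implicit none
    integer, parameter :: n = 1000, m = 8, npx_max = 3, npy_max = 3
    real(kind=8), allocatable :: tab(:,:,:), a(:,:)
    real(kind=8) :: expected
    integer :: ii, ipx, ipy, iw

    ! tab is allocatable, so each thread's private reduction copy is
    ! expected to be allocatable as well instead of a large stack array.
    allocate(tab(m, -npx_max:npx_max, -npy_max:npy_max))
    allocate(a(m, n))
    tab = 0.0d0
    a = 1.0d0   ! placeholder data: every element of tab should sum to n

    !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(ii, ipx, ipy, iw) REDUCTION(+:tab)
    do ii = 1, n
        do ipy = -npy_max, npy_max
            do ipx = -npx_max, npx_max
                do iw = 1, m
                    tab(iw, ipx, ipy) = tab(iw, ipx, ipy) + a(iw, ii)
                end do
            end do
        end do
    end do
    !$OMP END PARALLEL DO

    ! Self-check: stop with an error if any element has the wrong sum.
    expected = real(n, kind=8)
    if (any(abs(tab - expected) > 1.0d-9)) error stop "reduction gave wrong sums"
    print *, "tab(1,0,0) =", tab(1, 0, 0)
end program reduction_allocatable
```

With gfortran this would be built with `-fopenmp` and run with `OMP_NUM_THREADS` set to more than 1. Compared with the hand-written `wrk` array, the reduction clause leaves the per-thread copies and the final summation to the OpenMP runtime.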