I don't understand why this code:

    double precision :: array(200,200,100)
    double precision :: array2(200,200,100)

    !$OMP BARRIER
    !$OMP DO SCHEDULE(static)
    do z=1,100
      do y=1,200
        do x=1,200
          array2(x,y,z)=array(x,y,z)
        enddo
      enddo
    enddo
    !$OMP END DO NOWAIT
    !$OMP BARRIER

is much faster (~25% with 32 threads, compiled with ifort) than this one:

    double precision :: array(200,200,100)
    double precision :: array2(200,200,100)

    !$OMP BARRIER
    !$OMP WORKSHARE
    array2=array
    !$OMP END WORKSHARE NOWAIT
    !$OMP BARRIER

These two snippets are supposed to do exactly the same thing.
Edit: oops, I made a mistake when renaming my arrays, sorry.
Edit2: Sorry, I didn't search enough before posting. I found my answer in the question "Parallelizing fortran 2008 `do concurrent` systematically, possibly with openmp":
> Usage of OpenMP workshare directive is currently discouraged. It turns out that at least Intel Fortran Compiler and GCC serialise FORALL statements and constructs inside OpenMP workshare directives by surrounding them with OpenMP single directive during compilation which brings no speedup whatsoever. Other compilers might implement it differently but it's better to avoid its usage if portable performance is to be achieved.
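
For anyone landing here later, this is roughly what replacing the `WORKSHARE` copy with an explicit loop could look like as a standalone program. It is only a sketch: the program name, the `nx`/`ny`/`nz` parameters, the allocatable arrays, and the timing and check at the end are my own additions for illustration, and it uses a combined `parallel do` because the enclosing parallel region from my real code is not shown above.

```fortran
program copy_with_omp_do
  ! Illustrative sketch: names and sizes below are assumptions, not the original code.
  use omp_lib, only: omp_get_wtime
  implicit none
  integer, parameter :: nx = 200, ny = 200, nz = 100
  double precision, allocatable :: array(:,:,:), array2(:,:,:)
  double precision :: t0, t1
  integer :: x, y, z

  allocate(array(nx,ny,nz), array2(nx,ny,nz))
  call random_number(array)        ! fill the source array with test data

  t0 = omp_get_wtime()
  ! Explicit worksharing loop: the outer z iterations are split
  ! statically across the threads, instead of relying on WORKSHARE.
  !$omp parallel do schedule(static) private(x, y)
  do z = 1, nz
    do y = 1, ny
      do x = 1, nx
        array2(x,y,z) = array(x,y,z)
      end do
    end do
  end do
  !$omp end parallel do
  t1 = omp_get_wtime()

  ! Sanity check that the copy is complete, plus the wall-clock time.
  print *, 'copies match: ', all(array2 == array)
  print *, 'elapsed (s):  ', t1 - t0
end program copy_with_omp_do
```

It needs OpenMP enabled at compile time (for example `-fopenmp` with gfortran or `-qopenmp` with ifort). The timing from a single small copy like this is noisy, so treat it as a rough indication rather than a benchmark.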