4

I have read the article "Parallel Programming in Fortran 95 using OpenMP", which states on pages 11 and 12 that the following:

real(8) :: A(1000), B(1000)
!$OMP PARALLEL DO
do i = 1, 1000
   B(i) = 10 * i
   A(i) = A(i) + B(i)
enddo
!$OMP END PARALLEL DO

might not work, since the values of the array B are not guaranteed until !$OMP END (PARALLEL) DO. To me this is crucial. I have loops with many statements that depend on earlier statements in the same iteration, and I assumed this would work naturally. I understand that B(j) cannot be guaranteed in iteration i when i /= j, but within the same iteration I took it as a given. Am I correct, or have I misunderstood? If the article is right, is there a directive that ensures that, at least within an iteration, each statement sees the values written by the statements before it?

I have tried some simple loops that seem to work just as if they were serial code, but I have other code where the behavior seems more random (it works with /O3 but not with /O0). That code is quite large and a bit hard to read, so I won't post it here.

Erik Thysell

2 Answers

4

This claim looks very strange. If it were true, most of the OpenMP code you will see would be non-conforming. Constructs like this appear all over my codebase, and I believe the claim is bogus. Unfortunately the article does not cite the relevant part of the specification, so it is hard to tell what the author had in mind.

I would even say that features like atomic and critical sections would lose their purpose if it were as the author claims.

Without seeing the code that behaves randomly for you, we can't say anything about it. It is probably better not to mention it at all if you do not plan to show it.

  • Thanks Vladimir F, that makes me hopeful ☺ – Erik Thysell Jul 03 '15 at 20:47
  • I've written code like the example many times, and I have never had trouble with it. The references to `omp flush` indicate that this is about internal thread-local storage, such as CPU registers, that may not be synchronized with main memory. But that isn't a problem as long as only one thread reads and writes each variable, as is the case in that example. This question may be relevant: https://stackoverflow.com/questions/19687233/explict-flush-direcitve-with-openmp-when-is-it-necessary-and-when-is-it-helpful – amaurea Jul 04 '15 at 00:27
  • Had a look at the original article and it's not as well expressed as it could be. It's all connected with the OpenMP memory model, which can be confusing, but the essence here is that if a thread worked on iteration i it will get up-to-date versions of the variables associated with that iteration, while no other thread is guaranteed that until a data sync point is reached. Here that point is the implicit barrier at the end of the loop; if you add nowait, this condition is no longer satisfied. What the author is saying is that B, AS A WHOLE OBJECT, is not up to date until the end of the loop. – Ian Bush Jul 04 '15 at 07:01
  • But the author explicitly says that the loop will not work, and that is bogus in my opinion. Maybe in connection with some other worksharing construct and a `nowait`, yes, one must always be careful with `nowait`, but the author writes about that specific loop. – Vladimir F Героям слава Jul 04 '15 at 07:05
  • I agree with Vladimir - the example or wording of that section of the linked article is simply wrong. If the value of the whole of `B` was the issue, then the example would need to show some reference to the whole of `B` (and also the relevant OMP loop directives, including NOWAIT), which it doesn't. The example would also not need to include the definition and references to `A` in any form. – IanH Jul 04 '15 at 10:46
2

The statement in the referenced article is wrong.

Have a look at the paper "The OpenMP Memory Model", which explains the OpenMP memory model quite well.

Every thread is allowed to have its own "temporary view" of the shared part of memory, and the flow in both directions between that "view" and "memory" may be delayed (although an update can be forced by a flush directive, among other things). But there are no such restrictions within a single view. And since every iteration is guaranteed to be executed by only one thread, you can expect normal behavior within a single iteration. So the given example is guaranteed to work as expected.
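One way to convince yourself is a small self-checking sketch that runs the loop from the question both serially and in parallel and compares the results. The program name, the reference arrays `a_ref` and `b_ref`, and the initial value of `a` are illustrative assumptions added here, not part of the original article. Note that the sentinel must be written `!$omp` with no space after the `!`, otherwise the line is an ordinary comment.

```fortran
program same_iteration_check
   implicit none
   integer, parameter :: n = 1000
   real(8) :: a(n), b(n)            ! arrays updated by the parallel loop
   real(8) :: a_ref(n), b_ref(n)    ! hypothetical serial reference copies
   integer :: i

   ! The snippet in the question leaves A uninitialized; give it a value here.
   a = 1.0d0
   a_ref = 1.0d0

   ! Serial reference result.
   do i = 1, n
      b_ref(i) = 10 * i
      a_ref(i) = a_ref(i) + b_ref(i)
   end do

   ! Each iteration runs on exactly one thread, so the read of b(i)
   ! sees the value written one statement earlier in the same iteration.
   !$omp parallel do default(none) shared(a, b) private(i)
   do i = 1, n
      b(i) = 10 * i
      a(i) = a(i) + b(i)
   end do
   !$omp end parallel do

   ! The implicit barrier at the end of the loop makes a and b
   ! consistent as whole arrays before they are read here.
   if (all(a == a_ref) .and. all(b == b_ref)) then
      print *, 'parallel and serial results match'
   else
      print *, 'mismatch between parallel and serial results'
      error stop 1
   end if
end program same_iteration_check
```

With gfortran this can be built with `gfortran -fopenmp`; other compilers use a different switch to enable OpenMP. A passing run is not a proof of conformance, but it is consistent with the memory-model argument above.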

mastov