In the code I am attempting to port to OpenMP, I have a parallelized loop nested in an outer loop. Depending on the iteration of the outer loop, I would like a particular array to be either shared or reduction(+). Is there a way to do this in Fortran?

Here's a mockup of what I want:

    do i = 1, 2
      !$omp if(i.eq.1) parallel do reduction(+:foo)
      !$omp if(i.eq.2) parallel do shared(foo)
      do j = 1,j_max
        work on foo
      enddo
      !$omp end parallel
    enddo

The discussion in the question "openMP conditional pragma 'if else'" suggests that scheduling cannot be modified during execution. Is that also the case for data-sharing clauses such as shared, private, and reduction?

One obvious course of action is to create foo_1 (reduction:+) and foo_2 (shared), copy foo_1 to foo_2 after the first iteration on i, and then have if statements within the loop over j to refer to the proper array. But that's not terribly elegant. I'm hoping there's a better/cleverer/cleaner way to do this.

Edit: for the unimaginative, here's the pseudocode version of my alternative:

    do i = 1, 2
      !$omp parallel do reduction(+:foo_1), shared(foo_2)
      do j = 1,j_max

        if( i .eq. 1 ) then
          work on foo_1
        else
          work on foo_2
        endif

      enddo
      !$omp end parallel do

      ! copy only after the first pass, so the second pass's results are kept
      if( i .eq. 1 ) foo_2 = foo_1
    enddo
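For concreteness, here is a minimal runnable sketch of that two-array alternative. The program name `two_array_demo`, the array size `n`, and the placeholder "work" (a whole-array accumulation in the first pass, an element-wise doubling in the second) are assumptions made for illustration and are not taken from the real code.

```fortran
program two_array_demo
  implicit none
  integer, parameter :: n = 100      ! assumed problem size
  integer :: foo_1(n), foo_2(n)
  integer :: i, j

  foo_1 = 0
  foo_2 = 0

  do i = 1, 2
    ! The clauses are fixed when the code is compiled; the branch on i
    ! inside the loop decides which array is actually touched.
    !$omp parallel do reduction(+:foo_1) shared(foo_2)
    do j = 1, n
      if (i == 1) then
        ! Placeholder work: every iteration updates every element,
        ! so the threads need a reduction.
        foo_1 = foo_1 + j
      else
        ! Placeholder work: each iteration owns one element,
        ! so sharing the array is race-free.
        foo_2(j) = 2 * foo_2(j)
      end if
    end do
    !$omp end parallel do

    ! Hand the reduced result over to the shared array after pass 1 only.
    if (i == 1) foo_2 = foo_1
  end do

  ! Pass 1 gives n*(n+1)/2 in every element; pass 2 doubles it.
  if (any(foo_2 /= n*(n+1))) error stop "unexpected result"
  print '(a,i0)', 'foo_2(1) = ', foo_2(1)
end program two_array_demo
```

Compiled without OpenMP support the directives are treated as comments and the program runs serially with the same result.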
Grundulum
  • In your first example you have two nested directives that open two nested parallel sections. I guess you need to have two `!$omp end parallel` directives in the end. – Dima Chubarov Sep 09 '15 at 17:15
  • Is there a reason you don't want to remove the outer for loop and simply write two separate OMP enabled for loops? Without any nesting, this problem is trivial. – NoseKnowsAll Sep 09 '15 at 21:20
  • Branch outside the loops unless you hate performance. – Jeff Hammond Sep 10 '15 at 02:14
  • @NoseKnowsAll The code I posted is radically stripped down and simplified compared to what I'm actually using. The OpenMP section is 2,000 lines long and nested four levels deep. The branch occurs based on the state of the second loop, and may not even occur on any given run based on the input supplied to the code. Removing the outer loop is much more of a pain than just using two arrays. – Grundulum Sep 10 '15 at 02:35

1 Answer

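As you don't mind having two parallel regions, you could use orphaned directives, that is, worksharing directives placed in a routine called from the parallel region rather than lexically inside it. I find these great for organising the overall structure of large OpenMP codes. I mean something like: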

    ! First pass: foo is a reduction variable, as in the question
    i = 1
    !$omp parallel shared( i, ... ) reduction( +:foo )
    Call do_the_work( i, foo, ... )
    !$omp end parallel
    ! Second pass: foo is shared
    i = 2
    !$omp parallel shared( i, foo, ... )
    Call do_the_work( i, foo, ... )
    !$omp end parallel

...

    Subroutine do_the_work( i, foo, ... )
      !$omp do
      do j = 1,j_max
        work on foo
      enddo
    End Subroutine do_the_work

If the parallel region is as big as you say, it probably wants to be in one or more routines by itself anyway.
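To make the pattern concrete, here is a minimal self-contained sketch under some assumptions: the module name `work_mod`, the program name `orphaned_demo`, the array size `n`, and the placeholder work inside `do_the_work` are all invented for illustration. It follows the question's ordering, with `foo` as a reduction variable in the first pass and shared in the second.

```fortran
module work_mod
  implicit none
contains
  ! The "!$omp do" below is orphaned: it binds to whichever parallel
  ! region is active in the caller, so foo is either the thread's
  ! private reduction copy or the shared array, depending on the caller.
  subroutine do_the_work(i, foo)
    integer, intent(in)    :: i
    integer, intent(inout) :: foo(:)
    integer :: j

    !$omp do
    do j = 1, size(foo)
      if (i == 1) then
        ! Placeholder work: every iteration updates every element,
        ! which is only safe on a per-thread reduction copy.
        foo = foo + j
      else
        ! Placeholder work: each iteration owns one element,
        ! which is safe on a shared array.
        foo(j) = 2 * foo(j)
      end if
    end do
    !$omp end do
  end subroutine do_the_work
end module work_mod

program orphaned_demo
  use work_mod
  implicit none
  integer, parameter :: n = 100      ! assumed problem size
  integer :: foo(n)

  foo = 0

  ! Pass 1: each thread accumulates into its own copy of foo; the
  ! copies are summed into the original at the end of the region.
  !$omp parallel reduction(+:foo)
  call do_the_work(1, foo)
  !$omp end parallel

  ! Pass 2: all threads work directly on the one shared foo.
  !$omp parallel shared(foo)
  call do_the_work(2, foo)
  !$omp end parallel

  ! Pass 1 gives n*(n+1)/2 in every element; pass 2 doubles it.
  if (any(foo /= n*(n+1))) error stop "unexpected result"
  print '(a,i0)', 'foo(1) = ', foo(1)
end program orphaned_demo
```

The loop body is written once, and only the short `parallel` directives differ between the two passes. With gfortran, for example, this can be built with `-fopenmp`.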

Ian Bush
  • Thanks, but for my purposes using two arrays is cleaner than breaking off the parallel region into a subroutine. – Grundulum Sep 15 '15 at 03:54