store multi dimensional arrays to be operated on along all the dimensions

Question

Foreword

The Fortran program that I'm writing should deal with 1D, 2D and 3D problems depending on ndims, which can be 1, 2 or 3 and is read from an input file.

In these cases the quantity/ies of interest can be stored in arrays (one could be named phi)

of rank dims (ALLOCATABLE(:) or ALLOCATABLE(:,:) or ALLOCATABLE(:,:,:)),
or in arrays of rank 3 (ALLOCATABLE(:,:,:) with third dimension to be set equal to 1 in 2D or both second and third dimensions equal to 1 in 1D);

both cases are explained well in this answer. The first approach seems more elegant to me, but in the following I assume the second one, which is definitely simpler.

These quantities have to be operated on by several subroutines (e.g. mysub) along the ndims dimensions (along "pencils" should give a graphic idea), so I should call something like

SELECT CASE (ndims)

! 3D case
CASE (3)
  DO j = ...
    DO k = ...
      CALL mysub(phi(:,j,k))
    END DO
  END DO
  DO i = ...
    DO k = ...
      CALL mysub(phi(i,:,k))
    END DO
  END DO
  DO i = ...
    DO j = ...
      CALL mysub(phi(i,j,:))
    END DO
  END DO

! 2D case
CASE (2)
  DO j = ...
    DO k = ...
      CALL mysub(phi(:,j,1))
    END DO
  END DO
  DO i = ...
    DO k = ...
      CALL mysub(phi(i,:,1))
    END DO
  END DO

! 1D case
CASE (1)
  DO j = ...
    DO k = ...
      CALL mysub(phi(:,1,1))
    END DO
  END DO
END SELECT

Actual question

Can anyone suggest me (or help me to to devise!) a different way of store phi (maybe involving derived data types?) so that I can collapse the preceding code as follows?

DO id = 1, ndims
  CALL mysub2(phi,id)
END DO

(Here mysub2 acts on mysub's place.)

So the question is how should I store phi, so that I can substitute the first code with the second one?

Maybe I could return to the foreword and decide to follow the point 1., in which case would be easier to write a generic interface. This would be, I think, just a way to "hide" exactly what the SELECT CASE would do. Which of the two (SELECT CASE/generic INTERFACE) would be more efficient?

Are these the only two ways to face this problem?

This is becoming something of a FAQ. Most recently asked yesterday -- http://stackoverflow.com/questions/38058080/fortran-choosing-the-rank-of-an-allocatable-array. — High Performance Mark, Jun 28 '16 at 11:41
The question are clearly related, but the answer to that wouldn't be an answer to this: that question, if I correctly understood, is about choosing the rank of an array; my question is much more about "the best way to write a code which is not dependent on the rank of the array it operates on". — Enlico, Jun 28 '16 at 13:44

score 1 · Answer 1 · answered Jun 28 '16 at 14:47

1

You probably want something like this:

program test

   integer :: j,ndims
   integer :: n ! rank of each dimension, could also be read from input an allocated separately

   type arr
      real(8) :: x(n) ! one array for each dimension
   end type

   type(arr),allocatable :: phi

   read(*,*) ndims
   allocate(phi(ndims))

   do j=1,ndims
      call mysub(phi(j)%x) ! acts on the array in dimension j
   end do

contains

   subroutine mysub(x)
   ...
   end subroutine

end program

answered Jun 28 '16 at 14:47

PeMa

1,559
18
44

Well, this can be a starting point. I hope you can help further. Your answer effectively shows how to do a `DO` cycle on the number of dimensions `ndims`. On the other side, doing so, `arr` is not going to contain the same informations `phi` would contain. – Enlico Jun 28 '16 at 16:17
I added some detail to the actual question. – Enlico Jun 28 '16 at 16:23
Yeah, I think I answered a bit fast ^^. – PeMa Jun 28 '16 at 17:11
I mean the easiest answer is to pack your first code into a subroutine `mysub2(phi,id)` then you can use `mysub2` in you main code. But that's probably not as elegant as you want it? – PeMa Jun 28 '16 at 17:24

score 1 · Accepted Answer · answered Jun 28 '16 at 21:10

1

Perhaps I have misunderstood, but I think the answer to the specific question is to not make any changes to the storage or declaration of phi at all.

In the original code, three dimensional data (differentiating rank of the data from rank of the array used to store the data) is processed in slices along the first dimension, then the second, then the third. Two dimensional data is processed along the first, then the second, and one dimensional data is processed along the first only.

So with id going from 1 to the number of dimensions in the data, consider the following implementation of mysub2:

SUBROUTINE mysub2(phi, id)
  TYPE(pink_elephant), INTENT(IN) :: phi(:,:,:)
  INTEGER, INTENT(IN) :: id

  INTEGER :: i, j, k

  SELECT CASE (id)
  CASE (1)
    DO j = ...
      DO k = ...
        CALL mysub(phi(:,j,k))
      END DO
    END DO

  CASE (2)
    DO i = ...
      DO k = ...
        CALL mysub(phi(i,:,k))
      END DO
    END DO

  CASE (3)
    DO i = ...
      DO j = ...
        CALL mysub(phi(i,j,:))
      END DO
    END DO

  END SELECT
END SUBROUTINE mysub2

~~

Generic interfaces can always be resolved at "compile time" - the specific procedure (not type bound) or binding (type bound) that will be invoked by a particular CALL statement or function reference can be determined just from looking at the declarations in the code.

If you have a situation where "runtime" information is going to affect the choice of procedure, then there has to be some other executable mechanism, other than or additional to the resolution of a generic, that comes into play - if statement, select case, dynamic dispatch, etc, etc, etc.

Asking whether generic resolution is more efficient than a executable decision is therefore not particularly meaningful - they are different things.

answered Jun 28 '16 at 21:10

IanH

21,026
2
37
59

(Then, maybe "**efficient**" is not the right word.) In your code the `SELECT CASE (ndims)` is performed each time `mysub2` is `CALL`ed. If the rank of `phi` was chosen using a `SELECT CASE (ndims)` (similarly to what's done in the answer I linked), than an `INTERFACE` block could be used to call 3 different `SUBROUTINE`s for the 3 different ranks. Wouldn't this be **faster**? – Enlico Jun 29 '16 at 07:24
1

A generic interface is just a naming convenience for putting different procedures behind the same identifier. Accessibility aside, you can simply change a generic name to the relevant specific name that the generic resolves to at compile time, and get exactly the same behaviour. Their use is not a question of execution efficiency, their use is a question of "which characters do I type in my source to reference a particular procedure". Are you actually asking "at what level of some nested iterative structure is it most efficient to make a particular decision"? – IanH Jun 29 '16 at 08:39
Maybe I'm a little bit confused. You say I can simply change a generic name to the relevant specific name. This seems so simple! I argue that the SELECT CASE would be slower. Am I wrong? If so, why? (About the last question in quotes, I didn't get it.) – Enlico Jun 29 '16 at 13:54
Another question to better understand your answer. You said a generic interfaces can always be resolved at "compile time". Than that if a run time information it's needed (as is the case of file reading, isn't it?) the interface is not enough. So you mean in this case the interface is not resolved at compile time? – Enlico Jun 29 '16 at 18:34
1

If you are making a decision based on information only available at runtime, then that decision has to be made at runtime, via `if`, `select case` or whatever, *at some stage*. When and how you make that runtime decision will affect performance. But resolution of a generic interface is not a runtime decision - the specific procedure or binding that a generic resolves to is selected on the basis of the **declared** type, kind and rank of the actual arguments in the reference to the generic, and the declared type, kind and rank of the actual arguments are always known at compile time. – IanH Jun 29 '16 at 20:40
Ok, maybe I got it. If the specific subroutine is to be called inside an iterative loop (many many times), then it'd be better to make the three procedures act on 1-, 2-, and 3-rank arrays, put them in an explicit interface, and move the select case before the iterative loop just to declare 1-, 2-, or 3-rank array depending on the read input (like in [this answer](http://stackoverflow.com/a/7504063/5825294)). Am I right? – Enlico Jun 30 '16 at 06:25
Are you referring to the first part of that answer only? That uses sequence association to associate an actual argument to a dummy argument of different rank. That is something that you could consider - though I wouldn't regard it as "collapsing" code - you are just going to replace what you have now with three different subroutines and also moving the location of the `select case`. Note that you cannot use sequence association and generic interfaces - because sequence association (which permits a rank mismatch) prevents using rank to work out which specific procedure to call. – IanH Jun 30 '16 at 07:25
An alternative to using sequence association would be to explicitly pass a section of the top level array - i.e. for the 2D case you pass `phi(:,:,1)`, 1D `phi(:,1,1)`. You would still have three specific procedures, with the procedure to be called selected by `select case` at the top level. In this case the rank of the actual and dummy arguments match for each procedure, so if you wanted you could hide the names of the specific procedures behind a generic interface (but the runtime decision is done by the high level `select case`, it has nothing to do with the generic interface.) – IanH Jun 30 '16 at 07:30
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/116077/discussion-between-enrico-maria-de-angelis-and-ianh). – Enlico Jun 30 '16 at 10:34

store multi dimensional arrays to be operated on along all the dimensions

2 Answers2