generating efficient sub block matrices to avoid duplicated code

Question

The following code works correctly. I generate a 9x9 matrix, which consists of 9 separate 3x3 sub block matrices. However, lots of code seems duplicated and I'm using an excessive amount of DO loops it seems.

Is there any way I can generate the same 9x9 matrix but without duplicating lots of code and minimizing the amount of do loops?

My actual problem involves a square matrix much larger than a 9x9 matrix, so this TestCode is not so general or useful yet even though it works.

The case of the matrix being 9x9 here is just for a minimal, complete, verifiable example. In general, I need to do this for an n by n matrix where each sub-block is of size sqrt(n) by sqrt(n).

PROGRAM TestCode

IMPLICIT NONE
INTEGER :: i, j, m!matrix indices (i,j)
INTEGER,PARAMETER :: n = 9 ! matrix is 9x9
DOUBLE PRECISION :: KE(n,n)
REAL :: nn 

nn = n
m = SQRT(nn)

DO i = 1, m
    DO j = 1, m
        IF( i .EQ. j) THEN 
            KE(i,j) = -4
        ELSEIF ( ABS(i-j) .EQ. 1) THEN
            KE(i,j) = 1
        ELSE 
            KE(i,j) = 0
        END IF
    END DO
END DO

DO i = 4,6
    DO j = 4,6
        IF( i .EQ. j) THEN 
            KE(i,j) = -4
        ELSEIF ( ABS(i-j) .EQ. 1) THEN
            KE(i,j) = 1
        ELSE 
            KE(i,j) = 0
        END IF
    END DO
END DO

DO i = 7,9
    DO j = 7,9
        IF( i .EQ. j) THEN 
            KE(i,j) = -4
        ELSEIF ( ABS(i-j) .EQ. 1) THEN
            KE(i,j) = 1
        ELSE 
            KE(i,j) = 0
        END IF
    END DO
END DO

DO i = 4,6
    DO j = 1,m
        IF( ABS(i-j) .EQ. m) THEN 
            KE(i,j) = 1
        ELSE 
            KE(i,j) = 0
        END IF
    END DO
END DO

DO i = 7,9
    DO j = 1,m
        KE(i,j) = 0
    END DO
END DO

DO i = 1,m
    DO j = 4,6
        IF ( ABS(i-j) .EQ. m) THEN
            KE(i,j) = 1
        ELSE
            KE(i,j) = 0
        END IF
    END DO
END DO

DO i = 7,9
    DO j = 4,6
        IF ( ABS(i-j) .EQ. m) THEN
            KE(i,j) = 1
        ELSE
            KE(i,j) = 0
        END IF
    END DO
END DO

DO i = 1,m
    DO j = 7,9
        KE(i,j) = 0
    END DO
END DO

DO i = 4,6
    DO j = 7,9
        IF( ABS(i-j) .EQ. m) THEN
            KE(i,j) = 1
        ELSE
            KE(i,j) = 0
        END IF
    END DO
END DO

END PROGRAM

Does [this other question](https://stackoverflow.com/q/34262261/3157076) help? — francescalus, Feb 29 '20 at 20:38
@francescalus Thanks for suggesting that question. I had found that post before I posted this. I'm not entirely sure what you're doing in your answer with the array constructors so I didn't want to just copy the code since I didn't understand it at all. I understand the code that I wrote above, so I'm trying to simplify it based off something I understand myself. If you think my question is a duplicate though, I do understand. However, would it be easier for me to use VladimirF's answer instead of your array constructor method answer? — Jeff Faraci, Feb 29 '20 at 20:45
@francescalus Also, The case of the matrix being 9x9 here is just for a minimal, complete, verifiable example. In general, I need to do this for an n by n matrix where each sub-block is of size sqrt(n) by sqrt(n). So I think my question may be slightly different than that post you suggest. Although I may just be missing the connection. Thanks — Jeff Faraci, Feb 29 '20 at 20:48
Oops, I didn't recognise my answer so I'll have to reread the question/answers again to answer your points. (Not a duplicate suggestion as I haven't fully time to go over.) — francescalus, Feb 29 '20 at 21:00
One thing you can do to make it faster is switch the indices, Fortran is column major. — albert, Feb 29 '20 at 21:15
Use a subroutine, that's what they are there for. Have the matrix and loop limits as arguments, and you should be away. Best would be to order the loops as albert suggets, that could give you an order of magnitude increase in speed. — Ian Bush, Feb 29 '20 at 22:17
If the same 3x3 is replicated, then maybe just copy it after the first instance is made? Most of the array is zero, so that is a one liner as well. — Holmz, Mar 01 '20 at 08:27

score 2 · Accepted Answer · answered Mar 02 '20 at 15:56

Given the periodic nature of the data the submatrices and the whole block matrix can be filled via the PAD= argumant to the RESHAPE intrinsic.

program blox
   implicit none
   integer m
   integer n
   integer, allocatable :: zero(:), sp3(:,:,:)
   integer, allocatable :: b1(:,:), b2(:,:), b3(:,:)
   integer, allocatable :: iKE(:,:)
   character(80) fmt
   m = 4
   n = m**2
   zero = reshape([integer::],[m+1],pad=[0])
   b1 = reshape([integer::],[m,m],pad=[-4,1,zero(1:m-2),1])
   b2 = reshape([integer::],[m,m],pad=[1,zero(1:m)])
   b3 = reshape([integer::],[m,m],pad=zero(1:m+1))
   sp3 = reshape([integer::],[m,m,m-2],pad=b3)
   iKE = reshape(reshape([integer::],[m,m,m,m],pad=[b1,b2,sp3,b2],order=[1,3,2,4]),[n,n])
   write(fmt,'(*(g0))') '(',m,'(',m,'(',m,'i3:1x)/))'
   write(*,fmt) transpose(iKE)
end program blox

Notice how strings of zeros are created by padding out an empty array with zeros (zero) and then arrays with periodic data are filled by padding out an empty array array with a single period of data (bl, b2, and b3). Then a matrix consisting of copies of a block matrix is created by padding out an empty array with the block (sp3). Finally a block-periodic matrix is created by padding out an empty array with the sequence of blocks. The resulting matrix has to be read out in cross-order and then reshaped into the right dimensions (iKE).

Output:

 -4  1  0  0   1  0  0  0   0  0  0  0   0  0  0  0
  1 -4  1  0   0  1  0  0   0  0  0  0   0  0  0  0
  0  1 -4  1   0  0  1  0   0  0  0  0   0  0  0  0
  0  0  1 -4   0  0  0  1   0  0  0  0   0  0  0  0

  1  0  0  0  -4  1  0  0   1  0  0  0   0  0  0  0
  0  1  0  0   1 -4  1  0   0  1  0  0   0  0  0  0
  0  0  1  0   0  1 -4  1   0  0  1  0   0  0  0  0
  0  0  0  1   0  0  1 -4   0  0  0  1   0  0  0  0

  0  0  0  0   1  0  0  0  -4  1  0  0   1  0  0  0
  0  0  0  0   0  1  0  0   1 -4  1  0   0  1  0  0
  0  0  0  0   0  0  1  0   0  1 -4  1   0  0  1  0
  0  0  0  0   0  0  0  1   0  0  1 -4   0  0  0  1

  0  0  0  0   0  0  0  0   1  0  0  0  -4  1  0  0
  0  0  0  0   0  0  0  0   0  1  0  0   1 -4  1  0
  0  0  0  0   0  0  0  0   0  0  1  0   0  1 -4  1
  0  0  0  0   0  0  0  0   0  0  0  1   0  0  1 -4

Surprising that you really need the explicit form of this matrix, though. Usually you see this kind of object handled more indirectly via sparse matrix techniques.

Thanks for the detailed solutions. It compiles and runs yielding the exact results I was looking for. I need to think about what you're doing here a bit as this code is a bit advanced relative to what I'm used to doing. Why do you print out the Transpose of iKE and not just iKE? Aren't iKE and transpose(iKE) equal? I thought I needed this matrix because I am trying to generate the 2D Laplacian finite difference discretization matrix. Perhaps it is much more efficient to use sparse matrix techniques however I didn't think to do this... — Jeff Faraci, Mar 02 '20 at 18:01
I printed out `transpose(iKE)` because Fortran stores matrices in column-major order but they should be printed out in row major order unless you're used to reading matrices from their transposes. Of course it doesn't matter for symmetric matrices but it seems to me to be less surprising to see matrices printed out row major. If you go to some sort of sparse matrix technique you may be able to solve much bigger problems. [First link I tried](https://en.wikipedia.org/wiki/Discrete_Poisson_equation). — user5713492, Mar 02 '20 at 21:24
Thanks, your explanation makes sense. I have another question, what's the point of making sp3? Isn't it just an array of zeros? I'm confused of the difference between b3 and sp3. I can see what you're doing when you make b1,b2,b3. Also, why did you make them all integer, allocatable opposed to REAL? And did you need to make them allocatable - I never use allocatable arrays, I typically define the size at the beginning of the program. Thanks again for your help — Jeff Faraci, Mar 02 '20 at 22:38
There are a few different ways of packing copies of one array into another and rather than throw the book at you I just used one of them. `sp3` is `m-2` copies of `b3` and I wrote the example such that it would work even if `.NOT.all(b3==0)`. They are syntactically different: just try `sp3=b3` or `b3=sp3` even when `m==3` and they contain the same sequence of elements. I could have used `spread(0,1,m**2*(m-2))` in place of `sp3` thus saving 2 lines of code, in fact I could have replaced all of those `reshape`s with a single executable statement, but it would have been hard to read. — user5713492, Mar 03 '20 at 02:32
For making copies, the `PAD=` argument to `RESHAPE` says "if you don't have enough data, append copies until you get enough". `SPREAD` says 'make `NCOPIES=` copies of the `SOURCE=` array and make the new dimension `DIM=`. `EOSHIFT` can suck in `SHIFT=` copies of the `BOUNDARY=` array like Sagittarius A*. I think I could have avoided the cross-order indexing had I used it. Everything was allocatable so that the mesh size could be varied if needed. I needed integer output so it would fit in one screen. I could have used `double precision KE` instead of `integer iKE` at the end, though. — user5713492, Mar 03 '20 at 02:51

High Performance Mark · Answer 2 · 2020-03-01T15:39:36.383

There are a lot of ways of making block matrices in Fortran, the answers already pointed out by @francescalus show a couple and I've certainly got some in my toolbox. Here's another approach, perhaps simpler but satisfactory for OP's immediate requirements.

First, declare a variable for the blocks, in the following it's called blk and it's declared to be m*m. Then it's as simple as ...

 ke = 0.0  ! All elements will be 0.0 unless otherwise assigned

 ! Define the block on the diagonal, and assign it
 blk = RESHAPE([-4.0, 1.0, 0.0, 1.0, -4.0, 1.0, 0.0, 1.0, -4.0], [m,m])
 DO i = 1, n, m
    ke(i:i+m-1,i:i+m-1) = blk
 END DO

 ! Now the off-diagonal blocks
 blk = RESHAPE([1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0], [m,m])
 DO i = 4, n, m
    ke(i:i+m-1,i-m:i-1) = blk
    ke(i-m:i-1,i:i+m-1) = blk
 END DO

If I was going to use this in production code then I'd wrap it into a routine and give more consideration to matters such as using allocatable arrays, passing assumed-shape arrays, that kind of good stuff.

Other useful tools in the Fortran programmers armoury applicable here arechsift and eoshift. I might use the latter as follows. First, let's have an n-element array of doubles, called line, then

 ke = 0.0

 line = [0.0, 1.0, 0.0, 1.0, -4.0, 1.0, 0.0, 1.0, 0.0]

 ke((n/2)+1,:) = line
 DO i = 1, 4
    ke(i,:) = EOSHIFT(line,m+2-i)
    ke(n-i+1,:) = EOSHIFT(line,-(m+2-i))
 END DO

As to the matter of efficiency, I doubt that there is much between any of the approaches here, or in similar approaches. I suppose minimising the number of scans across the arrays would be a good idea, and visiting elements in memory-friendly order would be useful too. But they all have to visit each element at least once. I'd go for the most straightforward approach, that is the one I found most straightforward for the problem at hand; your choice for your problem might well be different. Only worry about efficiency if profiling shows there is a problem.

generating efficient sub block matrices to avoid duplicated code

2 Answers2