2

There is a portion of my f90 program that is taking up a significant amount of compute time. I am basically looping through three matrices (of the same size, with dimensions as large as 250-by-250), and trying to make sure values stay bounded within the interval [-1.0, 1.0]. I know that it is best practice to avoid conditionals in loops, but I am having trouble figuring out how to re-write this block of code for optimal performance. Is there a way to "unravel" the loop or use a built-in function of some sort to "vectorize" the conditional statements?

do ind2 = 1, size(u_mat,2)
    do ind1 = 1,size(u_mat,1)
        ! Dot product 1 must be bounded between [-1,1]
        if (b1_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b1_dotProd(ind1,ind2) = 1.0_dp
        else if (b1_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b1_dotProd(ind1,ind2) = -1.0_dp
        end if
        ! Dot product 2 must be bounded between [-1,1]
        if (b2_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b2_dotProd(ind1,ind2) = 1.0_dp
        else if (b2_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b2_dotProd(ind1,ind2) = -1.0_dp
        end if
        ! Dot product 3 must be bounded between [-1,1]
        if (b3_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b3_dotProd(ind1,ind2) = 1.0_dp
        else if (b3_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b3_dotProd(ind1,ind2) = -1.0_dp
        end if
    end do
end do

For what it's worth, I am compiling with ifort.

veryreverie
  • 2,871
  • 2
  • 13
  • 26
fdrek1988
  • 33
  • 3
  • You might be interested in [this question](https://stackoverflow.com/questions/315306/is-if-expensive) and its various answers. – veryreverie Nov 13 '21 at 14:58

2 Answers2

4

You can use the intrinsic min and max functions for this.

As they are both elemental, you can use them on the whole array, as

b1_dotProd = max(-1.0_dp, min(b1_dotProd, 1.0_dp))

While there are processor instructions which allow min and max to be implemented without branches, it will depend on the compiler implementation of min and max as to whether or not this is actually done and if this is actually any faster, but it is at least a lot more concise.

veryreverie
  • 2,871
  • 2
  • 13
  • 26
3

The answer by @veryreverie is definitely correct, but there are two things to consider.

  1. A where statement is another sensible choice. Since it still is a conditional choice the same caveat of

whether or not this actually avoids branches and if it's actually any faster, but it is at least a lot more concise

still applies.

One example is:

    pure function clamp(X) result(res)
        real, intent(in) :: X(:)
        real :: res(size(X))
        where (X < -1.0)
            res = -1.0
        else where (X > 1.0)
            res = 1.0
        else
            res = X
        end where
    end function

  1. If you want to normalize to strictly 1 or -1, I would actually think about changing the datatype to integer. Then you can actually use a == 1 etc. without thinking about floating point equality problems. Depending on your code I would also think about cases where the dot product gets close to zero. Of course this point only applies, if you are only interested in the sign.
    pure function get_sign(X) result(res)
        real, intent(in) :: X(:)
        integer :: res(size(X))
        ! Or use another appropiate choice to test for near_zero
        where (abs(X) < epsilon(X) * 10.)
            res = 0
        else where (X < 0.0)
            res = -1
        else where (X > 0.0)
            res = +1
        end where
    end function
mcocdawc
  • 1,748
  • 1
  • 12
  • 21
  • 2
    As I understand the code fragment, the value is to be clamped to the interval [-1,1]. That doesn't mean "integer". – francescalus Nov 16 '21 at 10:35
  • 1
    Just a nitpick, but can `where` constructs be made branchless? `min` and `max` can be implemented [using x86 instructions](https://stackoverflow.com/a/55111487/8559300) to avoid branches altogether, but I'd be surprised if a compiler could do this for `where`. – veryreverie Nov 16 '21 at 16:52