If or function pointers in Fortran

Question

As is so common with Fortran, I'm writing a massively parallel scientific code. At the beginning of my code I read my configuration file, which tells me which type of solver I want to use. That means that in a subroutine (during the main run) I have

if(solver.eq.1)then
  call solver1()
elseif(solver.eq.2)then
  call solver2()
else
  call solver3()
endif

Edit to avoid some confusion: This if is inside my time integration loop and I have one that is inside 3 nested loops.

Now my question is: wouldn't it be more efficient to use function pointers instead, as the solver variable will not change during execution, except in the initialisation procedure?

Obviously function pointers are F2003. That shouldn't be a problem as long as I use gfortran 4.6. But I'm mainly using a BlueGene P. There is an F2003 compiler there, so I suppose it's going to work as well, although I couldn't find any conclusive evidence on the web.

If you are working on a BlueGene, why aren't you using XLFortran? It is very well documented; you could start here: http://www-01.ibm.com/software/awdtools/fortran/ – High Performance Mark Mar 22 '12 at 20:00
Sorry for not being explicit. Of course I use XLFortran on the BG. And why did I not find this page today? I must have been looking for the wrong keywords. It clearly states that procedure pointers are OK. Thx. – Azrael3000 Mar 22 '12 at 20:02
For this specific situation I'd use a pre-processor and get the conditionals out of the executable altogether. – agentp Oct 17 '16 at 20:10

score 3 · Accepted Answer · answered Mar 22 '12 at 19:53

Knowing nothing about Fortran, this is my answer: The main problem with branches is that a CPU potentially cannot speculatively execute code across them. To mitigate this problem, branch prediction was introduced (which is very sophisticated in modern CPUs).

Indirect calls through a function pointer can be a problem for the prediction unit of the CPU. If it can't predict where the call will actually go, this will stall the pipeline.
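For reference, the kind of indirect call under discussion could look like the following in Fortran 2003. This is only a minimal sketch: the module name `solver_dispatch`, the `select_solver` helper, the argument-free interface, and the solver bodies are all assumptions made for illustration, not part of the code in the question.

```fortran
module solver_dispatch
  implicit none

  ! All solvers share this (assumed) argument-free interface.
  abstract interface
     subroutine solver_iface()
     end subroutine solver_iface
  end interface

  ! Set once during initialisation, then called from the time loop.
  procedure(solver_iface), pointer :: solve => null()

contains

  ! Placeholder solver bodies, for illustration only.
  subroutine solver1()
    print *, 'solver1 called'
  end subroutine solver1

  subroutine solver2()
    print *, 'solver2 called'
  end subroutine solver2

  subroutine solver3()
    print *, 'solver3 called'
  end subroutine solver3

  ! Same selection logic as the if/elseif chain, but executed only once.
  subroutine select_solver(solver)
    integer, intent(in) :: solver
    select case (solver)
    case (1)
       solve => solver1
    case (2)
       solve => solver2
    case default
       solve => solver3
    end select
  end subroutine select_solver

end module solver_dispatch

program pointer_demo
  use solver_dispatch
  implicit none
  integer :: step

  call select_solver(2)   ! placeholder for the value from the configuration file
  do step = 1, 3
     call solve()         ! indirect call; its target never changes inside the loop
  end do
end program pointer_demo
```

Because the pointer target is fixed after initialisation, the indirect call always goes to the same address, which is about the easiest case an indirect-branch predictor can be given.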

I am quite sure that the CPU will correctly predict whether your branch is taken, because its outcome never changes during the run, which is a trivial case for the predictor.

Maybe the CPU can speculate across the indirect call, maybe it can't. This is why you need to test which is better.

If it cannot, you will certainly notice in your benchmark.

In addition, maybe you can hoist the if test out of your inner loop so it is not evaluated as often. That would make the actual performance of the branch irrelevant.
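Hoisting could be sketched as follows. The names `hoisted_loops`, `run_hoisted`, and the three `kernel` functions are hypothetical stand-ins for whatever work the real solvers do per grid point; the point is only that the solver choice is tested once, outside the three nested loops.

```fortran
module hoisted_loops
  implicit none
contains

  ! Hypothetical per-point kernels standing in for the three solvers.
  pure real function kernel1(i, j, k)
    integer, intent(in) :: i, j, k
    kernel1 = real(i + j + k)
  end function kernel1

  pure real function kernel2(i, j, k)
    integer, intent(in) :: i, j, k
    kernel2 = real(i * j * k)
  end function kernel2

  pure real function kernel3(i, j, k)
    integer, intent(in) :: i, j, k
    kernel3 = 1.0
  end function kernel3

  ! The solver choice is tested once; each case owns a full copy of the loop nest.
  function run_hoisted(solver, n) result(total)
    integer, intent(in) :: solver, n
    real :: total
    integer :: i, j, k

    total = 0.0
    select case (solver)
    case (1)
       do k = 1, n
          do j = 1, n
             do i = 1, n
                total = total + kernel1(i, j, k)
             end do
          end do
       end do
    case (2)
       do k = 1, n
          do j = 1, n
             do i = 1, n
                total = total + kernel2(i, j, k)
             end do
          end do
       end do
    case default
       do k = 1, n
          do j = 1, n
             do i = 1, n
                total = total + kernel3(i, j, k)
             end do
          end do
       end do
    end select
  end function run_hoisted

end module hoisted_loops

program hoist_demo
  use hoisted_loops
  implicit none
  ! The literal 2 is a placeholder for the value read from the configuration file.
  print *, run_hoisted(2, 4)
end program hoist_demo
```

The cost of this layout is visible in the sketch itself: the loop nest is written out three times, so any change to the loop structure has to be repeated in every case.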

Seeing your edit, I can only stress my last recommendation: duplicate the inner loops three times and do the if test outside of them. This moves the branch off the hot path. – usr Mar 22 '12 at 22:14
Clearly that is an option. But it will make the code unreadable if, for example, you have two different properties with 3 options each: that makes 9 different ifs to maintain. – Azrael3000 Mar 24 '12 at 10:34
Maybe you can push the loop into the solver. Then you can call the solver through a function pointer and still have no performance loss. – usr Mar 24 '12 at 11:14
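Pushing the loop into the solver, as suggested in the last comment, might look like the sketch below. The names `loop_solvers`, `sweep_iface`, `sweep1`, and `sweep2`, the array argument, and the arithmetic in the loop bodies are assumptions for illustration. Each routine contains its own loop nest, so the procedure pointer is dereferenced once per time step rather than once per grid point.

```fortran
module loop_solvers
  implicit none

  ! Assumed common interface: every solver sweeps over the whole field.
  abstract interface
     subroutine sweep_iface(field)
       real, intent(inout) :: field(:, :, :)
     end subroutine sweep_iface
  end interface

contains

  ! Hypothetical solver 1: the loop nest lives inside the routine.
  subroutine sweep1(field)
    real, intent(inout) :: field(:, :, :)
    integer :: i, j, k
    do k = 1, size(field, 3)
       do j = 1, size(field, 2)
          do i = 1, size(field, 1)
             field(i, j, k) = field(i, j, k) + 1.0
          end do
       end do
    end do
  end subroutine sweep1

  ! Hypothetical solver 2: same loop structure, different update.
  subroutine sweep2(field)
    real, intent(inout) :: field(:, :, :)
    integer :: i, j, k
    do k = 1, size(field, 3)
       do j = 1, size(field, 2)
          do i = 1, size(field, 1)
             field(i, j, k) = 2.0 * field(i, j, k)
          end do
       end do
    end do
  end subroutine sweep2

end module loop_solvers

program sweep_demo
  use loop_solvers
  implicit none
  procedure(sweep_iface), pointer :: sweep => null()
  real :: field(4, 4, 4)
  integer :: solver, step

  solver = 1               ! placeholder for the value from the configuration file
  if (solver == 1) then
     sweep => sweep1
  else
     sweep => sweep2
  end if

  field = 0.0
  do step = 1, 10
     call sweep(field)     ! one indirect call per time step, not per grid point
  end do
  print *, field(1, 1, 1)
end program sweep_demo
```

With two independent properties, each could get its own pointer, which avoids writing out every combination as a separate branch.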

High Performance Mark · Answer 2 · score 2 · 2012-03-22T21:15:43.800

If you only plan to use the function pointers once, at initialisation, and you are running codes on a BlueGene, isn't your concern for efficiency misdirected? Generally, any initialisation which works is OK; if it takes 1 s instead of 1 ms it's probably going to have no impact on total execution time.

Code initialisation routines for clarity, ease of modification, that sort of thing.

EDIT

My guess is that using function pointers rather than your current code will have no impact on execution speed. But it's just an (educated, perhaps) guess and I'll be very interested in any data you gather on this question.

edited Mar 22 '12 at 21:15

answered Mar 22 '12 at 20:04

Sorry, now that I read my question again I see your confusion. I will update it in a second. – Azrael3000 Mar 22 '12 at 20:16
Okay. Well I'll wait and see if somebody has some insight to share. Otherwise I'll try and scratch my scarce time together to do some benchmarks. I'll post them here. – Azrael3000 Mar 22 '12 at 21:37

Jannis Teunissen · Answer 3 · score 1 · 2016-10-17T19:44:57.913

After a brief search I couldn't find the answer to the question, so I ran a little benchmark myself (see this link for the Makefile & dependencies). The benchmark consists of:

1. Draw a random number to select method a, b, or c, each of which performs a simple addition on its single integer argument
2. Call the chosen method 100 million times, using either a procedure pointer or if-statements
3. Repeat the above 5 times

The result with gfortran 4.8.5 on an Intel Xeon E5-2630 v3 CPU @ 2.40 GHz is:

Time per call (proc. pointer):       1.89 ns
Time per call (if statement):        1.89 ns

In other words, there is not much of a performance difference!
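A benchmark along the lines described above could be sketched as follows. This is not the original benchmark code, which is not reproduced in this thread. The names `bench_methods`, `method_a`, `method_b`, and `method_c` are assumptions, and reading the iteration count from the command line is an added precaution so that the loop length is not known at compile time. Since everything is in one file here, the compiler may still inline or fold the calls, so the numbers should be treated with caution.

```fortran
module bench_methods
  implicit none
  abstract interface
     subroutine method_iface(x)
       integer, intent(inout) :: x
     end subroutine method_iface
  end interface
contains
  ! Three hypothetical methods, each a simple addition on its argument.
  subroutine method_a(x)
    integer, intent(inout) :: x
    x = x + 1
  end subroutine method_a

  subroutine method_b(x)
    integer, intent(inout) :: x
    x = x + 2
  end subroutine method_b

  subroutine method_c(x)
    integer, intent(inout) :: x
    x = x + 3
  end subroutine method_c
end module bench_methods

program bench
  use bench_methods
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  procedure(method_iface), pointer :: method => null()
  integer(i8) :: t0, t1, t2, rate
  integer :: n_calls, i, rep, choice, x_ptr, x_if, ios
  real :: r
  real(dp) :: scale
  character(len=32) :: arg

  ! Default of 100 million calls; an optional first argument overrides it.
  ! Default integers overflow if 3*n_calls exceeds huge(x_ptr).
  n_calls = 100000000
  if (command_argument_count() >= 1) then
     call get_command_argument(1, arg)
     read (arg, *, iostat=ios) n_calls
     if (ios /= 0) n_calls = 100000000
  end if

  call random_seed()
  do rep = 1, 5
     call random_number(r)
     choice = 1 + int(3.0 * r)          ! 1, 2 or 3
     select case (choice)
     case (1)
        method => method_a
     case (2)
        method => method_b
     case default
        method => method_c
     end select

     x_ptr = 0
     call system_clock(t0, rate)
     do i = 1, n_calls
        call method(x_ptr)               ! procedure-pointer variant
     end do
     call system_clock(t1)

     x_if = 0
     do i = 1, n_calls
        if (choice == 1) then            ! if-statement variant
           call method_a(x_if)
        else if (choice == 2) then
           call method_b(x_if)
        else
           call method_c(x_if)
        end if
     end do
     call system_clock(t2)

     ! Convert clock ticks to nanoseconds per call.
     scale = 1.0e9_dp / (real(rate, dp) * real(max(n_calls, 1), dp))
     print '(a, f8.2, a)', 'Time per call (proc. pointer): ', real(t1 - t0, dp) * scale, ' ns'
     print '(a, f8.2, a)', 'Time per call (if statement):  ', real(t2 - t1, dp) * scale, ' ns'
     print *, 'results (printed to keep the loops live):', x_ptr, x_if
  end do
end program bench
```

Printing the accumulated values gives the compiler a reason to keep the loops, but it does not stop it from replacing a loop with a closed-form result, so a per-call time near zero is a sign that the calls were optimised away.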

edited Oct 17 '16 at 19:44

answered Oct 17 '16 at 15:48

You don't supply the full benchmark, so I had to write the subroutines myself, but I get `0 ns` for the `if statement` version with `-O3`. That is some difference! – Vladimir F Героям слава Oct 17 '16 at 16:11
I've added a link to the other files that are used in the benchmark! – Jannis Teunissen Oct 17 '16 at 19:45
If I move the module to the same file or use `-flto` I still get `0 s` for your benchmark. It is not a good benchmark. – Vladimir F Героям слава Oct 17 '16 at 19:54
Vladimir F, when you compile with `-flto` or when you put the module in the same file, your compiler is smart enough to compute the result by itself. You can make the computation more complicated (which reduces the precision of the benchmark), leave out `-flto`, or read the number of iterations from the command line to prevent this optimization. – Jannis Teunissen Oct 17 '16 at 20:30
Of course, I know, that's why I tried it. How do you know that the compiler is not playing any tricks without `-flto`? – Vladimir F Героям слава Oct 17 '16 at 21:39
Because if it triggers that so easily I don't trust that there are no more hidden tricks applied by the compiler. – Vladimir F Героям слава Oct 17 '16 at 22:52

score 1 · Answer 4 · edited May 23 '17 at 12:33

If your solver routines take a non-trivial runtime, then the trivial runtime of the IF statements is likely to be immaterial. If the solver routines have a runtime comparable to that of the IF statement, then the total runtime is very short, so why do you care? This seems an optimization unlikely to pay off.

The first rule of runtime optimization is to profile your code to see which portions are consuming the runtime. Otherwise you are likely to optimize portions that are unimportant, which will accomplish nothing.

For what it's worth, someone else recently had a very similar concern: Fortran Subroutine Pointers for Mismatching Array Dimensions
