
A common way of implementing OO-like encapsulation and polymorphism in C is to return opaque pointers to a structure that contains function pointers. This pattern is very common, for example in the Linux kernel.

Using function pointers instead of function calls introduces an overhead which is mostly negligible due to caching, as has already been discussed in other questions.

However, with the new -fwhole-program and -flto optimization options for GCC (>4.6), things change.

libPointers.c

#include <stdlib.h>
#include "libPointers.h"

void do_work(struct worker *wrk, const int i) 
{
        wrk->datum += i;
}

struct worker *libPointers_init(const int startDatum)
{
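        struct worker *wrk = malloc (sizeof (struct worker));

        /* malloc can fail; report that to the caller instead of dereferencing NULL */
        if (wrk == NULL)
                return NULL;
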
        *wrk = (struct worker) {
                .do_work = do_work,
                .datum = startDatum
        };

        return wrk;
}

libPointers.h

#ifndef LIBPOINTERS_H
#define LIBPOINTERS_H


struct worker {
        int datum;

        void (*do_work)(struct worker *, int i);
};

/* Also exported directly, so callers can bypass the function pointer */
extern void do_work (struct worker *wrk, const int i);

struct worker *libPointers_init(const int startDatum);


#endif /* LIBPOINTERS_H */

testPointers.c

#include <stdio.h>
#include "libPointers.h"


int main (void)
{
        unsigned long i;
        struct worker *wrk;

        wrk = libPointers_init(56);
        if (wrk == NULL)
                return 1;

        for (i = 0; i < 1e10; i++) {
#ifdef USE_POINTERS
                wrk->do_work(wrk,i);
#else
                do_work(wrk,i);
#endif
        }

        printf ("%d\n", wrk->datum);
}

When compiled with -O3 but without the -flto -fwhole-program flags, testPointers takes around 25s on my machine, regardless of whether USE_POINTERS is #defined.

If I turn on the -flto -fwhole-program flags, testPointers still takes around 25s with USE_POINTERS #defined, but only around 14s when the direct function call is used.

This is completely expected behavior, since I understand that the compiler will inline and optimize the directly called function in the loop. I wonder, however, whether there is a way to help the compiler by telling it that the function pointer is constant, so that it can optimize that case, too.

For those using CMake, here's how I compiled it:

CMakeLists.txt

set (CMAKE_C_FLAGS "-O3 -fwhole-program -flto")
#set (CMAKE_C_FLAGS "-O3")
add_executable(testPointers
        libPointers.c
        testPointers.c
        )
Metiu
  • What if you capture the value of `wrk->do_work` in a local function pointer outside the loop, and then use that local variable inside the loop? – Greg Hewgill Nov 15 '12 at 18:42
  • How do you dare asking a good and meaningful question? :P (+1, this is interesting.) –  Nov 15 '12 at 18:42
  • One problem is that `do_work` is doing extremely little work. If it actually did something significant, the difference in calling speed would be harder to measure (and thus not very significant). – Bo Persson Nov 15 '12 at 18:49
  • I understand, but getter/setter functions are very very common and they do even less work. With whole-program optimization, we have the luxury of mediating the access to data and full optimization of those simple actions in the final binary. – Metiu Nov 15 '12 at 18:51
  • @GregHewgill I changed it this way, unfortunately with no difference:

        int main (void)
        {
                unsigned long i;
                struct worker *wrk;

                wrk = libPointers_init(56);
        #ifdef USE_POINTERS
                void (*const f_do_work)(struct worker *, int i) = wrk->do_work;
        #endif
                for (i = 0; i < 1e10; i++) {
        #ifdef USE_POINTERS
                        f_do_work(wrk,i);
        #else
                        do_work(wrk,i);
        #endif
                }

                printf ("%d\n", wrk->datum);
        }

    – Metiu Nov 15 '12 at 19:01
  • C++ compilers can do devirtualization to inline virtual function calls. But I think it will be harder to coerce a C compiler to do what you want. – Mysticial Nov 15 '12 at 19:02

2 Answers


The compiler can't inline a function unless it can determine that only one possible version of the function will be called. By calling through a pointer it's not trivially obvious that this is the case. It still might be possible for the compiler to figure it out, since if you follow the code there's only one possible value that the pointer could take; however this would be above and beyond what I'd expect the compiler to do.
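To illustrate the "only one possible value" case, here is a minimal sketch, not taken from the question's code. It assumes a hypothetical `worker_ops` table declared `static const` in the same translation unit as the caller, so the call target is a visible compile-time constant. The names `worker_ops`, `add_datum`, `default_ops` and `run_with_known_ops` are made up for the example. Whether a given GCC version actually turns the indirect call into a direct, inlined one is not guaranteed and would have to be checked in the generated assembly.

```c
#include <assert.h>

struct worker;

/* Hypothetical operations table: the function pointers live in a const
 * object instead of in each worker instance. */
struct worker_ops {
        void (*do_work)(struct worker *, int);
};

struct worker {
        int datum;
        const struct worker_ops *ops;
};

static void add_datum(struct worker *wrk, int i)
{
        wrk->datum += i;
}

/* The only table in this example; its contents can never change. */
static const struct worker_ops default_ops = { .do_work = add_datum };

/* Adds 0 .. n-1 to start, going through the pointer on every iteration. */
int run_with_known_ops(int start, int n)
{
        struct worker wrk = { .datum = start, .ops = &default_ops };

        assert(n >= 0);
        for (int i = 0; i < n; i++)
                wrk.ops->do_work(&wrk, i);

        return wrk.datum;
}
```

The design point is that `add_datum` cannot overwrite a `const` table, so the optimizer has a chance to prove that the target never changes between iterations. With the pointer stored in the mutable, heap-allocated struct from the question, that proof is much harder.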

Mark Ransom
  • Yeah that was what I was stating by saying "this is completely expected behavior". I was wondering whether there was some combination of the const attribute which would tell the compiler that the function was not going to change, along the lines of "pure", "const", etc... This at least would help the common "non-virtual method" case. – Metiu Nov 16 '12 at 05:59
  • Another common keyword that comes to mind is the restrict keyword, which has a similar contract: stay assured that you can treat a pointer in some way. Would be nice to say "this callback is going to always point to this function" – Metiu Nov 16 '12 at 06:08
  • If you, as the programmer, know that a specific function will *always* be called in some situation, then you can simply call that function directly instead of using a pointer. – Greg Hewgill Nov 16 '12 at 09:42

If you are calling the function pointer in a loop, you can move the loop inside the function that the pointer refers to, so that the indirect call happens once rather than once per iteration.
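A minimal sketch of that idea, assuming a hypothetical batch callback `do_work_n` that replaces the per-element `do_work` from the question. The names `do_work_one`, `sum_work_n` and `run_batch`, and the `long` datum, are illustrative choices and not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

struct worker {
        long datum;
        /* Hypothetical batch entry point: processes every i in [0, n). */
        void (*do_work_n)(struct worker *, long n);
};

/* Per-element work; static inline so it can be folded into the loop below. */
static inline void do_work_one(struct worker *wrk, long i)
{
        wrk->datum += i;
}

/* The loop now lives inside the callee, where do_work_one is a direct call. */
void sum_work_n(struct worker *wrk, long n)
{
        assert(wrk != NULL);
        for (long i = 0; i < n; i++)
                do_work_one(wrk, i);
}

/* Caller side: a single indirect call instead of n of them. */
long run_batch(long start, long n)
{
        struct worker wrk = { .datum = start, .do_work_n = sum_work_n };

        wrk.do_work_n(&wrk, n);
        return wrk.datum;
}
```

The trade-off is that the interface changes from "process one item" to "process a range", which only works when the caller really does iterate over a batch.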