A common way of implementing OO-like code encapsulation and polymorphism in C is to return opaque pointers to a structure containing some function pointers. This is a very frequent pattern for example in the Linux kernel.
Using function pointers instead of function calls introduces an overhead which is mostly negligible due to caching, as has already been discussed in other questions.
However, with the new -fwhole-program and -flto optimization options for GCC (>4.6), things change.
libPointers.c
#include <stdlib.h>
#include "libPointers.h"
void do_work(struct worker *wrk, const int i)
{
wrk->datum += i;
}
struct worker *libPointers_init(const int startDatum)
{
struct worker *wrk = malloc (sizeof (struct worker));
*wrk = (struct worker) {
.do_work = do_work,
.datum = startDatum
};
return wrk;
}
libPointers.h
#ifndef __LIBPOINTERS_H__
#define __LIBPOINTERS_H__
struct worker {
int datum;
void (*do_work)(struct worker *, int i);
};
extern void do_work (struct worker *elab, const int i);
struct worker *libPointers_init(const int startDatum);
#endif //__LIBPOINTERS_H__
testPointers.c
#include <stdio.h>
#include "libPointers.h"
int main (void)
{
unsigned long i;
struct worker *wrk;
wrk = libPointers_init(56);
for (i = 0; i < 1e10; i++) {
#ifdef USE_POINTERS
wrk->do_work(wrk,i);
#else
do_work(wrk,i);
#endif
}
printf ("%d\n", wrk->datum);
}
Compiling with -O3, but without -flto -fwhole-program flags, testPointers execution takes around 25s on my machine, regardless whether USE_POINTERS is #defined or not.
If I turn on the -flto -fwhole-program flags, testPointers takes around 25s with USE_POINTERS #defined, but around 14s if a function call is used.
This is completely expected behavior, since I understand that the compiler will inline and optimize the function in the loop. I wonder, however, if there's a way of helping the compiler telling it that the function pointer is constant and so allowing it to optimize that case, too.
For those using cmake, here's how I compiled it
CMakeLists.txt
set (CMAKE_C_FLAGS "-O3 -fwhole-program -flto")
#set (CMAKE_C_FLAGS "-O3")
add_executable(testPointers
libPointers.c
testPointers.c
)