template magic for wrapping C callbacks that take void* parameters?

Question

Say I'm using a C API that lets you register callbacks that take a void* closure:

void register_callback(void (*func)(void*), void *closure);

In C++ it's nice to have stronger types than void* so I want to create a wrapper that lets me register strongly-typed C++ callbacks instead:

template <typename T, void F(T*)>
void CallbackWrapper(void *p) {
  return F(static_cast<T*>(p));
}

void MyCallback(int* param) {}

void f(void *closure) {
  register_callback(CallbackWrapper<int, MyCallback>, closure);
}

This works alright. One nice property of this solution is that it can inline my callback into the wrapper, so this wrapping scheme has zero overhead. I consider this a requirement.

But it would be nice if I could make the API look more like this:

void f2() {
  RegisterCallback(MyCallback, closure);
}

I hope I can achieve the above by inferring template parameters. But I can't quite figure out how to make it work. My attempt so far is:

template <typename T>
void RegisterCallback(void (*f)(T*), T* closure) {
  register_callback(CallbackWrapper<T, f>, closure);
}

But this doesn't work. Anyone have a magic incantation that will make f2() work above, while retaining the zero-overhead performance characteristic? I want something that will work in C++98.

I'm having trouble seeing the point of this. What benefit is a wrapper if it's getting casted to `void*` anyways? — Pubby, May 17 '13 at 06:29
The wrapper saves the C++ function from having to do a static_cast. — Josh Haberman, May 17 '13 at 06:38
It can also type-check to make sure that the closure you pass when you register the callback is the same type that the callback takes as its parameter. — Josh Haberman, May 17 '13 at 06:41
Actually this is undefined. Your wrapper is using a C++ ABI. The C callback, like all C code, only uses a C ABI. If this works, you just happen to be getting lucky that the ABIs are aligned. — Martin York, May 17 '13 at 06:53
@LokiAstari The problem would still hold if, instead of a *literal* C API, the OP were faced with a *C-style* API. — Luc Danton, May 17 '13 at 06:56
Are you looking for something like boost::any? It can at least add some level of sanity - you can make sure that the any has the correct type when converting it out from a void*. — Zac, May 17 '13 at 07:10
@LokiAstari: what's undefined about it? I convert it to a void* to pass to C, C passes it back to C++ as a void* which I then static_cast<> to a more specific pointer type. Nothing undefined about it. — Josh Haberman, May 17 '13 at 07:14
@JoshHaberman: It's below the language level. The ABI defines where parameters and results are put, how the stack frame is cleaned up, what is on the stack frame for exception handling, and so on. Your code is passing a C++ function with a C++ ABI to a function that is expecting a function with a C ABI. You just happen to be getting lucky that it works. — Martin York, May 17 '13 at 07:16
@LokiAstari: I know about ABIs, but a function declared "extern C" follows a C ABI. The ability of C++ to call "extern C" functions is well-established. — Josh Haberman, May 17 '13 at 07:20
@JoshHaberman: Unfortunately template functions can not be declared extern "C". Nor do I see any `extern "C"` declarations above. — Martin York, May 17 '13 at 07:23
@JoshHaberman A function declaration in e.g. an `extern "C"` block taking a function pointer parameter is declared to accept a pointer to a function with C language linkage, which `CallbackWrapper` isn't. — Luc Danton, May 17 '13 at 07:23
@LucDanton: are you saying that a regular C++ function (ie. with C++ linkage) can never be passed as a function pointer to a C function? I've never heard of this, and I can't find any mention of it in the standard. Do you have a reference for this? — Josh Haberman, May 17 '13 at 07:40
@JoshHaberman Well, a C++ program can declare e.g. `typedef void callback_type(); extern "C" void f(callback_type* func, void* data);`, ensuring thus that `f` is a C linkage function taking a pointer to a C++ function. As you can probably tell from experience, this is not usually done. (Language linkage is in 7.5 for either standard.) — Luc Danton, May 17 '13 at 07:45
@JoshHaberman: It's because C and C++ are different languages. You can't show (prove) a negative. That's why the standard defines what you can do, not a list of things you can't do. For this code to work, you would need a clause in the standard saying that C has been made binary compatible with C++. There is no such clause in the standard. — Martin York, May 17 '13 at 07:52
@JoshHaberman: You can see [this question](http://stackoverflow.com/q/15536488/315052) for some discussion about this issue. — jxh, May 17 '13 at 07:53

jxh · Answer 1 · 2013-05-17T15:00:14.847

4

This template function improves the syntax marginally.

template <typename T, void F(T*)>
void RegisterCallback (T *x) {
    register_callback(CallbackWrapper<T, F>, x);
}

int x = 4;
RegisterCallback<int, MyCallback>(&x);

If you are willing to use a functor rather than a function to define your callback, then you can simplify things a bit more:

#ifdef HAS_EXCEPTIONS
# define BEGIN_TRY try {
# define END_TRY } catch (...) {}
#else
# define BEGIN_TRY
# define END_TRY
#endif

template <typename CB>
void CallbackWrapper(void *p) {
    BEGIN_TRY
    return (*static_cast<CB*>(p))();
    END_TRY
}

struct MyCallback {
    MyCallback () {}
    void operator () () {}
};

template <typename CB>
void RegisterCallback (CB &x) {
    register_callback(CallbackWrapper<CB>, &x);
}

MyCallback cb;
RegisterCallback(cb);

But, as others have mentioned, you run the risk of the code not porting correctly to a system where the C ABI and C++ ABI differ.
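
For readers worried about the language-linkage point, one portable option is a non-template trampoline declared `extern "C"`, since a function template cannot be given C linkage. The following is a minimal sketch, not the original poster's code: `register_callback` is reimplemented here as a stand-in for the real C library, and `c_callback_t`, `fire_registered`, `MyCallbackTrampoline`, and `RegisterMyCallback` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

extern "C" {
// Pointer to a function with C language linkage.
typedef void (*c_callback_t)(void*);
}

// Stand-in for the C library's internal state (assumption, illustration only).
static c_callback_t g_func = 0;
static void* g_closure = 0;

// Stand-in for the real C API; it just remembers the callback and closure.
extern "C" void register_callback(c_callback_t func, void* closure) {
  g_func = func;
  g_closure = closure;
}

// Hypothetical helper that simulates the library firing the callback.
void fire_registered() {
  if (g_func) g_func(g_closure);
}

// Strongly typed C++ callback.
void MyCallback(int* param) { *param *= 2; }

// Non-template trampoline with C language linkage. It names MyCallback
// directly, so the compiler may inline the call.
extern "C" void MyCallbackTrampoline(void* p) {
  assert(p != 0);
  MyCallback(static_cast<int*>(p));
}

// Typed registration: the closure must be an int*.
void RegisterMyCallback(int* closure) {
  register_callback(MyCallbackTrampoline, closure);
}
```

Because the trampoline names `MyCallback` directly, the compiler is still free to inline the call. The cost is that each callback needs its own hand-written (or macro-generated) trampoline.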

edited May 17 '13 at 15:00

answered May 17 '13 at 08:13

jxh


Thank you, both for this suggestion and for the reference to the ABI issue. I think it's unfortunate that the standard says this, since in practice the ABIs never differ and tons of code relies on this (for example, passing a C++ function to pthread_create()). Realistically I don't think anyone will ever be able to enforce this rule. But still, I am glad I know about it now. – Josh Haberman May 17 '13 at 08:35
No other ideas for making the template call any prettier while still allowing the inlining? – Josh Haberman May 17 '13 at 08:36
@JoshHaberman: I am unable to figure out a way to pull out the parameter type as a trait of the function, so both template arguments are needed. I had originally intended to allow only the function name to be passed in, but failed. – jxh May 17 '13 at 08:37
Can't you at least switch the two template arguments and use automatic deduction for the second one then (`T`): `template ...` -> `RegisterCallback(&x);`? Wait, doesn't work, since `void F(T*)` wouldn't know `T` then. – Christian Rau May 17 '13 at 08:42
@ChristianRau: I thought that was a C++11 feature. Was I mistaken? – jxh May 17 '13 at 08:45
@user315052 No, automatic template argument deduction has always worked (that's what things like `std::make_pair` are for), but the way I described wouldn't work anyway, since `F(T*)` has to know about `T` beforehand. – Christian Rau May 17 '13 at 08:47
@ChristianRau: Ah, that's what you meant. – jxh May 17 '13 at 09:00
@JoshHaberman: I updated the answer with something that should make things simpler, but you would be defining your callbacks as functors. – jxh May 17 '13 at 09:01
Thank you, your suggestions have been most helpful! – Josh Haberman May 17 '13 at 09:07
@JoshHaberman: `passing a C++ function to pthread_create()`. Yes, this happens a lot. But when you work at a place with experienced programmers, they slap your hand at code review time and tell you to go away and do it correctly. People usually learn after their first time and stop doing it. The problem is that it does break on a lot of systems out there. Your experience is limited to systems where it does not break. But in my experience, when I was working at Veritas, this broke on 50% of the systems. There are lots more compilers than `clang/g++` and more OSes than you can shake a stick at. – Martin York May 17 '13 at 14:49
PS. You should probably catch exceptions to prevent them from being propagated back across stack frames that can't handle exceptions in C libraries. – Martin York May 17 '13 at 14:51
@LokiAstari: Please tell me what systems you know of that fail to run this program: https://gist.github.com/anonymous/5599924 – Josh Haberman May 17 '13 at 15:43
PS. The code you posted should work everywhere. It does not illustrate the point we were talking about. In your posted code `run()` knows that the passed function is a C++ function even though it is a C function (because it is compiled inside a C++ compilation unit). What you meant is to put the `extern "C" int times2(int x) { return x * 2; }`. – Martin York May 17 '13 at 17:54
@LokiAstari: most platforms specify a combined C and C++ ABI. I want to know if there is actually any good reason to have separate ABIs, or if it ever happens in practice. If neither of these is true, and if a large body of code relies on them being the same (which it does), then I think it is reasonably likely that a future version of the standard will consider this a defect and correct it. I'm still deciding whether I consider this a safe bet. – Josh Haberman May 17 '13 at 17:57
@LokiAstari: I think my example does illustrate the point: it does not matter what the compiler "knows", if the function types are different then calling a C++ function from an "extern C" function is undefined behavior. The compiler does not know whether this function will also be called directly from C with a C function pointer. – Josh Haberman May 17 '13 at 18:01
I personally agree there is no good reason to have different ABI's for the language. – Martin York May 17 '13 at 18:04
The problem is that they exist. Usually the C++ ABI is defined by the compiler vendor (and there can be multiple on a system). The C ABI is usually defined by the hardware vendor (with input from the OS); if they have a compiler, the C++/C versions will usually work well together. But that does not mean third-party C++ vendors are going to comply with the hardware vendor's C++ ABI (they are free to choose their own, and they might have better optimizations because of their ABI enhancements). The C ABI, though, is specific to the platform "hardware/OS" combination. – Martin York May 17 '13 at 18:07
PS. Your code is wrong (well, correct). It should work everywhere because you are passing a C++ function pointer to a function expecting a C++ function pointer. It knows how to build the correct call to a C++ function. As I pointed out above, if you put `extern "C"` on `times2()` then you get the incompatibility. – Martin York May 17 '13 at 18:11

score 1 · Accepted Answer · answered Sep 03 '14 at 23:19

I have discovered a better answer to this question than the other answers given to me here! (Actually it was another engineer inside Google who suggested it).

You have to repeat the function name twice, but that can be solved with a macro.

The basic pattern is:

// Func1, Func2, Func3: Template classes representing a function and its
// signature.
//
// Since the function is a template parameter, calling the function can be
// inlined at compile-time and does not require a function pointer at runtime.
// These functions are not bound to a handler data so have no data or cleanup
// handler.
template <class R, class P1, R F(P1)>
struct Func1 {
  typedef R Return;
  static R Call(P1 p1) { return F(p1); }
};

// ...

// FuncSig1, FuncSig2, FuncSig3: template classes reflecting a function
// *signature*, but without a specific function attached.
//
// These classes contain member functions that can be invoked with a
// specific function to return a Func/BoundFunc class.
template <class R, class P1>
struct FuncSig1 {
  template <R F(P1)>
  Func1<R, P1, F> GetFunc() { return Func1<R, P1, F>(); }
};

// ...

// Overloaded template function that can construct the appropriate FuncSig*
// class given a function pointer by deducing the template parameters.
template <class R, class P1>
inline FuncSig1<R, P1> MatchFunc(R (*f)(P1)) {
  (void)f;  // Only used for template parameter deduction.
  return FuncSig1<R, P1>();
}

// ...

// Function that casts the first parameter to the given type.
template <class R, class P1, R F(P1)>
R CastArgument(void *c) {
  return F(static_cast<P1>(c));
}

template <class F>
struct WrappedFunc;

template <class R, class P1, R F(P1)>
struct WrappedFunc<Func1<R, P1, F> > {
  typedef Func1<R, void*, CastArgument<R, P1, F> > Func;
};

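// Declared before GetWrappedFuncPtr(), which uses it in its return type.
typedef void (generic_func_t)(void*);

template <class T>
generic_func_t *GetWrappedFuncPtr(T func) {
  (void)func;  // Only used for template parameter deduction.
  typedef typename WrappedFunc<T>::Func Func;
  return Func().Call;
}

// User code:

#include <iostream>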

void StronglyTypedFunc(int *x) {
  std::cout << "value: " << *x << "\n";
}

int main() {
  generic_func_t *f = GetWrappedFuncPtr(
      MatchFunc(StronglyTypedFunc).GetFunc<StronglyTypedFunc>());
  int x = 5;
  f(&x);
}

This is not short or simple, but it is correct, principled, and standard-compliant!

It gets me what I want:

The user gets to write StronglyTypedFunc() taking a pointer to a specific thing.
This function can be called with a void* argument.
There is no virtual function overhead or indirection.
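
The repetition of the function name can be hidden behind a macro, as the answer suggests. A minimal sketch under stated assumptions: the macro name `WRAP_FUNC` is hypothetical (the answer does not give one), and the supporting templates are a condensed copy of the pattern above so that the snippet compiles on its own.

```cpp
typedef void generic_func_t(void*);

// A function bound at compile time as a template argument.
template <class R, class P1, R F(P1)>
struct Func1 {
  static R Call(P1 p1) { return F(p1); }
};

// A bare signature; GetFunc<F>() attaches a specific function to it.
template <class R, class P1>
struct FuncSig1 {
  template <R F(P1)>
  Func1<R, P1, F> GetFunc() { return Func1<R, P1, F>(); }
};

// Deduces R and P1 from a function pointer; the pointer value is unused.
template <class R, class P1>
inline FuncSig1<R, P1> MatchFunc(R (*)(P1)) {
  return FuncSig1<R, P1>();
}

// Casts the void* argument to the typed parameter and forwards the call.
template <class R, class P1, R F(P1)>
R CastArgument(void* c) {
  return F(static_cast<P1>(c));
}

template <class F>
struct WrappedFunc;

template <class R, class P1, R F(P1)>
struct WrappedFunc<Func1<R, P1, F> > {
  typedef Func1<R, void*, CastArgument<R, P1, F> > Func;
};

template <class T>
generic_func_t* GetWrappedFuncPtr(T) {
  typedef typename WrappedFunc<T>::Func Func;
  return &Func::Call;
}

// Hypothetical macro: names the function once and expands it twice,
// first for type deduction and then as the template argument.
#define WRAP_FUNC(f) GetWrappedFuncPtr(MatchFunc(f).GetFunc<f>())

// Example strongly typed callback (illustrative only).
void StronglyTypedFunc(int* x) { *x += 1; }
```

With this, `generic_func_t *f = WRAP_FUNC(StronglyTypedFunc);` yields a plain function pointer, and calling `f(&x)` on an `int x` reaches `StronglyTypedFunc` with an `int*`. The macro argument should be a plain function name; an argument containing a top-level comma (such as a template-id with several arguments) would be split by the preprocessor.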

Does `generic_func_t` account for the potential C-C++ ABI compatibility issues discussed in the other comments? — Uyghur Lives Matter, Oct 08 '14 at 14:53

Martin York · Answer 3 · 2013-05-17T14:50:31.677

-3

Why not make your closure a real closure (by including real typed state)?

class CB
{
    public:
        virtual ~CB() {}
        virtual void action() = 0;
};

extern "C" void CInterface(void* data)
{
    try
    {
        // static_cast is the appropriate cast for recovering a CB* from void*.
        static_cast<CB*>(data)->action();
    }
    catch(...){}
    // No guarantees about throwing exceptions across a C ABI.
    // So you need to catch all exceptions and drop them
    // (or, probably better, log them).
}

void RegisterAction(CB& action)
{
    register_callback(CInterface, &action);
}

By using an object you can introduce real state.
You have a clean C++ interface with correctly typed objects.
It's easy to use: you just derive from CB and implement action().

This also has the same number of actual function calls as your code, because in your example you pass a function pointer to the wrapper (which can't be inlined; it can in principle, but it would take more static analysis than current compilers do). Apparently it does inline.
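
Using this interface might look like the following. This is a sketch under assumptions: `register_callback` is a stand-in for the real C library, and `c_callback_t`, `fire_registered`, and the `Counter` class are hypothetical names used only for illustration.

```cpp
extern "C" {
// Pointer to a function with C language linkage.
typedef void (*c_callback_t)(void*);
}

// Stand-in for the C library's internal state (assumption, illustration only).
static c_callback_t g_func = 0;
static void* g_data = 0;

// Stand-in for the real C API; it just remembers the callback and its data.
extern "C" void register_callback(c_callback_t func, void* data) {
  g_func = func;
  g_data = data;
}

// Hypothetical helper that simulates the library firing the callback.
void fire_registered() {
  if (g_func) g_func(g_data);
}

class CB {
 public:
  virtual ~CB() {}
  virtual void action() = 0;
};

// Single trampoline with C linkage; it swallows exceptions so that none
// propagate into the C caller.
extern "C" void CInterface(void* data) {
  try {
    static_cast<CB*>(data)->action();
  } catch (...) {
  }
}

void RegisterAction(CB& action) {
  register_callback(CInterface, &action);  // stored as a CB*, not a derived pointer
}

// Hypothetical callback that carries typed state.
class Counter : public CB {
 public:
  Counter() : count(0) {}
  virtual void action() { ++count; }
  int count;
};
```

Registering a `Counter c` with `RegisterAction(c)` and letting the library fire the callback twice leaves `c.count == 2`. Note that `RegisterAction` takes a `CB&`, so the pointer stored as `void*` is always a `CB*`, which is exactly what `CInterface` casts back to.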

edited May 17 '13 at 14:50

answered May 17 '13 at 07:03

Martin York


@KennyTM: No, it does not. In fact I am assuming he can't modify that function (as it is part of a library). Otherwise we would not be going through this process; we would just change the underlying library to C++ :-) – Martin York May 17 '13 at 07:13
This solution imposes a virtual function call overhead. I need something that can be inlined (I specified this as a requirement). – Josh Haberman May 17 '13 at 07:17
@JoshHaberman: Yes you did. But your code is not inlining either. So it does not seem fair to add other constraints that you are not really enforcing on your own code. Also the cost of a virtual function call over a normal function call is insignificant. Try timing it. So the cost of this is exactly the same as your wrapper idea. The difference is this works. – Martin York May 17 '13 at 07:21
My code *is* inlining MyCallback() inside CallbackWrapper. I verified this by looking at the assembly language output. I want something that will continue to do this with nicer syntax. – Josh Haberman May 17 '13 at 07:34
Also the cost is not "exactly the same." A virtual function call has a cost, and I have measured it, and in my application it is significant. It is very aggravating when people answer your question by telling you that they know your requirements better than you do. – Josh Haberman May 17 '13 at 07:36
Yes you are correct there is a theoretical cost. But with all the other things a CPU is doing in a modern operating system it is very hard to time and measure. In most situations you can not time it. If you claim to have timed the difference between a normal function call and a virtual function call and seen a difference then please show me the code that produces these results. I would love to see it. – Martin York May 17 '13 at 07:39
What I find aggravating is people writing code that does not work (portably) but having to deal with sub standard code for a long time makes you annoyed at the smallest things so I can be very annoying. :-) – Martin York May 17 '13 at 07:42
it's very easy to show; here's a benchmark I whipped up in 10 minutes that shows a 20% slowdown with virtual functions: https://gist.github.com/anonymous/5597659 – Josh Haberman May 17 '13 at 08:11
Also you're wrong to say that MyCallback isn't getting inlined in my example. Here I demonstrate that it is: https://gist.github.com/anonymous/5597712 – Josh Haberman May 17 '13 at 08:21
@JoshHaberman: As I suspected your timing is wrong. You are comparing 1 function call Vs 1 function call and 1 virtual function call. So not an apples to apples comparison. So when you correct for that the times are the same. – Martin York May 17 '13 at 14:34
@JoshHaberman: Yes it does seem to be inlined. OK. I apologize I did not think the static analysis would work because you were passing a function pointer (rather than a functor). But it does. So you now do get that performance increase (but not because the call is virtual) because you are making one less call. – Martin York May 17 '13 at 14:38
@JoshHaberman But that still does not change the fact that the code is broken. Fast-running broken code is still not very useful. – Martin York May 17 '13 at 14:40
You are *still wrong* about the timing. In my first example I did not create a separate function for the direct function call because it's obvious it could be inlined. But if this is not obvious to you, I created this version, which shows exactly the same results. Note how in the assembly cmp and icmp are exactly the same: https://gist.github.com/anonymous/5599803 – Josh Haberman May 17 '13 at 15:28
You corrected it and those now run in the same time no difference. But I am still correct in my assertion. There is no measurable difference between a normal function call and a virtual function call in terms of cost. If you are running on an embedded device with no OS then maybe you could potentially see it (actually time it). – Martin York May 17 '13 at 17:43
The Gist I linked shows a 20% difference between virtual and non-virtual function calls on x86-64. Why do you say there is "no measurable difference"? – Josh Haberman May 17 '13 at 17:52
Because I compiled ran and timed it. You are probably still timing your original code. Which does have a 20% difference because of the difference in calls. In your original code you had a 20% difference. You added a function call and still see a 20% difference so you assume the function call is now free! – Martin York May 17 '13 at 18:09
I think you are probably measuring the wrong thing. I've collapsed this into one single benchmark that tests both and prints the results. If you run this program (properly optimized) on x86-64 I will be very surprised if it prints out anything less than 10% virtual function call overhead (for me it prints 26%): https://gist.github.com/anonymous/5602346 – Josh Haberman May 17 '13 at 22:17
Sighh. Are we still discussing this! Yes you get a 15% increase in speed (if you remove the function call). I have added the appropriate assembly to the gist so you can validate. Please check your assembly I am sure you will see exactly the same results. That's not what I claimed. – Martin York May 20 '13 at 02:09
I see what you are saying now, you are saying the test is unfair because the direct function call can get inlined. But the fact that virtual calls cannot be inlined is part of their cost, and part of why you are wrong for saying the cost is "exactly the same." But even if we forget that and prevent the direct call from being inlined, you are *still* wrong: https://gist.github.com/haberman/5621682 – Josh Haberman May 21 '13 at 17:40
If I sound grumpy, it's because you're repeatedly claiming something that I know to be false. – Josh Haberman May 21 '13 at 17:54
@JoshHaberman: OK I agree that inlining makes function calls faster. and that is an advantage in this type of situation where the function is small and the cost of the call is greater than the cost of the function. – Martin York May 21 '13 at 18:00
@JoshHaberman: I disagree (it's not false) that a virtual function call is measurably more expensive than a normal function call. An offset lookup on a page that is more than likely to be in level-0 cache (vtables are often accessed) is exceedingly fast (on the order of cycles). Measuring this difference in a real OS is next to impossible because things like page faults will cause so much variance in the timing that cycle differences are drowned out. – Martin York May 21 '13 at 18:04
@JoshHaberman: The fact that you can't show me a working example that has a measurable difference is what I have seen throughout my career. People make the claim that virtual calls are much slower (because it seems so obvious). But when challenged to produce a bit of code that shows this. They never can. If it is **false** then as an engineer you should be able to show me the proof. Knowing it is false because you have a feeling is not valid. Knowing it is false because you have measured it. Then we have something to discuss. – Martin York May 21 '13 at 18:05
I just showed you the proof. Nothing about my argument is based on feelings. My last gist prevents the compiler from inlining the direct function call. You could not get a more apples-to-apples comparison. Look at the assembly yourself. The virtual function call still has a measured 10% overhead. – Josh Haberman May 21 '13 at 18:16
Also the result is consistently 10% on my machine, it is not very noisy at all. – Josh Haberman May 21 '13 at 18:17
Besides, the burden of proof is on you to say that two things are exactly the same despite the fact that one has to do strictly more work. The natural assumption is that the one that has to do more work will be slower. – Josh Haberman May 21 '13 at 18:20
@JoshHaberman: Agreed. The virtual call takes one (or maybe a couple on some hardware) more instruction. I am saying the cost of that instruction (since it merely accesses a cached page) is lost in the noise of a running operating system. If we were on an embedded device with no OS or interrupts then a virtual call would be measurably different. You have only shown what I already agree is true: that inlining can be useful. – Martin York May 21 '13 at 18:22
Also it's well-established in the literature that virtual function dispatch has a measurable direct cost (5% according to this paper: http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.ps) – Josh Haberman May 21 '13 at 18:25
A second ago you said "But when challenged to produce a bit of code that shows this. They never can." Then I showed you some. But you ignore it -- maybe you've been ignoring it throughout your career? Then you said it was still just about inlining even though there is NO inlining in my last benchmark. It is clear that no evidence will change your beliefs. – Josh Haberman May 21 '13 at 18:44
@JoshHaberman: Missed the second gist I'll look at it and read the paper. You did check that the code was not being inlined didn't you? – Martin York May 21 '13 at 19:27
