I wrote a test to measure the cost of C++ exceptions with threads.

#include <cstdlib>
#include <iostream>
#include <vector>
#include <thread>

static const int N = 100000;

static void doSomething(int& n)
{
    --n;
    throw 1;
}

static void throwManyManyTimes()
{
    int n = N;
    while (n)
    {
        try
        {
            doSomething(n);
        }
        catch (int n)
        {
            switch (n)
            {
            case 1:
                continue;
            default:
                std::cout << "error" << std::endl;
                std::exit(EXIT_FAILURE);
            }
        }
    }
}

int main(void)
{
    int nCPUs = std::thread::hardware_concurrency();
    std::vector<std::thread> threads(nCPUs);
    for (int i = 0; i < nCPUs; ++i)
    {
        threads[i] = std::thread(throwManyManyTimes);
    }
    for (int i = 0; i < nCPUs; ++i)
    {
        threads[i].join();
    }
    return EXIT_SUCCESS;
}

Here's the C version that I initially wrote for fun.

#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>
#include <glib.h>

#define N 100000

static GPrivate jumpBuffer;

static void doSomething(volatile int *pn)
{
    jmp_buf *pjb = g_private_get(&jumpBuffer);

    --*pn;
    longjmp(*pjb, 1);
}

static void *throwManyManyTimes(void *p)
{
    jmp_buf jb;
    volatile int n = N;

    (void)p;
    g_private_set(&jumpBuffer, &jb);
    while (n)
    {
        switch (setjmp(jb))
        {
        case 0:
            doSomething(&n);
        case 1:
            continue;
        default:
            printf("error\n");
            exit(EXIT_FAILURE);
        }
    }
    return NULL;
}

int main(void)
{
    int nCPUs = g_get_num_processors();
    GThread *threads[nCPUs];
    int i;

    for (i = 0; i < nCPUs; ++i)
    {
        threads[i] = g_thread_new(NULL, throwManyManyTimes, NULL);
    }
    for (i = 0; i < nCPUs; ++i)
    {
        g_thread_join(threads[i]);
    }
    return EXIT_SUCCESS;
}

The C++ version runs very slowly compared to the C version.

$ g++ -O3 -g -std=c++11 test.cpp -o cpp-test -pthread
$ gcc -O3 -g -std=c89 test.c -o c-test `pkg-config glib-2.0 --cflags --libs`
$ time ./cpp-test

real    0m1.089s
user    0m2.345s
sys     0m1.637s
$ time ./c-test

real    0m0.024s
user    0m0.067s
sys     0m0.000s

So I ran the callgrind profiler.

For cpp-test, __cxa_throw was called exactly 400,000 times with self-cost of 8,000,032.

For c-test, __longjmp_chk was called exactly 400,000 times with self-cost of 5,600,000.

The whole cost of cpp-test is 4,048,441,756.

The whole cost of c-test is 60,417,722.


I guess something much more than simply saving the state of the jump-point and later resuming is done with C++ exceptions. I couldn't test with larger N because the callgrind profiler will run forever for the C++ test.

What is the extra cost involved in C++ exceptions making it many times slower than the setjmp/longjmp pair at least in this example?

  • Can you compare the resulting machine code? – Kerrek SB Jul 18 '15 at 01:36
  • @KerrekSB Sorry, the disassembly for both is too complicated for me to get anything meaningful, I can paste the whole asm somewhere if you need it. –  Jul 18 '15 at 01:45
  • All sorts of groovy info in here: http://stackoverflow.com/questions/13835817/are-exceptions-in-c-really-slow – user4581301 Jul 18 '15 at 01:50
  • The exception handling mechanism involves numerous function calls (e.g. see the Itanium ABI for an example) and allocations. `setjmp` doesn't. Note that you can throw things other than integers, so perhaps the comparison isn't entirely fair. – Kerrek SB Jul 18 '15 at 01:52
  • You are testing the performance of exceptions which should **not** be called very often. If your program is throwing out that many exceptions where it's causing a performance issue you have a serious freakin' problem. What you should be testing is using try/catch blocks vs not using try/catch and see what the cost of having exceptions _ready_ to handle a problem. That will show you the cost of having exceptions _ready_ for when they are really needed. – Captain Obvlious Jul 18 '15 at 01:57
  • Exceptions (properly done) are MUCH different from setjmp/longjmp. With exceptions you can actually handle problems and keep the program running. – Hot Licks Jul 18 '15 at 02:32
  • You will need to provide more info about your build of g++. In Windows, g++ can be built with 3 different exception handling mechanisms: SetJmp/LongJmp, Win32 SEH, or Dwarf2. The latter two options are (I believe) zero overhead: if no exception is thrown then there is no runtime penalty. SJLJ wastes time setting up a setjmp when you enter a `try {` block. I don't know for sure but I suspect you would find that if you were using SJLJ, then you would get similar results to your C program; the SEH and Dwarf2 options make it faster for the most common use case (i.e. nothing thrown) – M.M Jul 18 '15 at 02:47
  • Note that exceptions involve all sorts of clean up work that `setjmp`/`longjmp` completely ignores. If you allocated memory between calling `setjmp` and then calling `longjmp`, you have probably leaked it when you call `longjmp`. Quite possibly you've leaked other resource (file descriptors, etc) too. Even if not in your test code, in the general case. Yes, `setjmp` and `longjmp` work, but they are a very crude exception handling mechanism in general. – Jonathan Leffler Jul 18 '15 at 04:08
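
The cleanup point raised in the comments above can be illustrated with a minimal sketch. The names `Guard`, `failWithGuard`, and `countCleanups` are hypothetical and not part of the question's code; the sketch only assumes standard C++11. It shows that a thrown exception runs the destructors of live local objects before the handler executes, which is work a `longjmp` does not perform.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper type: bumps a counter each time its destructor runs.
struct Guard {
    int& released;
    explicit Guard(int& counter) : released(counter) {}
    ~Guard() { ++released; }
};

// Throws while a Guard is alive, so stack unwinding has to run ~Guard().
static void failWithGuard(int& released)
{
    Guard g(released);
    throw std::runtime_error("boom");
}

// Throws `attempts` times and returns how many destructors ran in total.
int countCleanups(int attempts)
{
    int released = 0;
    for (int i = 0; i < attempts; ++i) {
        try {
            failWithGuard(released);
        } catch (const std::runtime_error&) {
            // The Guard was destroyed before control reached this handler.
            assert(released == i + 1);
        }
    }
    return released;
}
```

Calling `countCleanups(5)` returns 5: one destructor call per throw. Replacing the throw with a `longjmp` out of `failWithGuard` would skip the destructor, and doing so is undefined behavior in C++ when non-trivial destructors are bypassed.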

1 Answer

This is by design.

C++ exceptions are expected to be exceptional in nature and are optimized accordingly. The program is compiled to be most efficient when an exception does not happen.

You can verify this by commenting out the exception from your tests.

In C++:

    //throw 1;

$ g++ -O3 -g -std=c++11 test.cpp -o cpp-test -pthread

$ time ./cpp-test

real    0m0.003s
user    0m0.004s
sys     0m0.000s

In C:

    /*longjmp(*pjb, 1);*/

$ gcc -O3 -g -std=c89 test.c -o c-test `pkg-config glib-2.0 --cflags --libs`

$ time ./c-test

real    0m0.008s
user    0m0.012s
sys     0m0.004s

What is the extra cost involved in C++ exceptions making it many times slower than the setjmp/longjmp pair at least in this example?

g++ implements the zero-cost exception model, which has no effective overhead* when no exception is thrown. Machine code is produced as if there were no try/catch block.

The price of this zero-overhead design is that a table lookup must be performed on the program counter when an exception is thrown, to determine a jump to the appropriate code for performing stack unwinding. This puts the entire try/catch block implementation within the code performing a throw.
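
To make the idea of a lookup keyed on the program counter concrete, here is a toy model. It is not the table format g++ actually emits; `CallSite` and `findLandingPad` are hypothetical names invented for this sketch, and the addresses are plain integers. It only shows the principle: given a PC, find the code range that contains it and return the handler address registered for that range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for one row of an unwind table.
struct CallSite {
    std::uintptr_t start;      // first address covered by this row
    std::uintptr_t length;     // number of bytes covered
    std::uintptr_t landingPad; // handler address, 0 meaning "no handler"
};

// Returns the landing pad for `pc`, or 0 if no row covers it.
// Assumes `table` is sorted by `start` and its ranges do not overlap.
std::uintptr_t findLandingPad(const std::vector<CallSite>& table,
                              std::uintptr_t pc)
{
    assert(std::is_sorted(table.begin(), table.end(),
        [](const CallSite& a, const CallSite& b) { return a.start < b.start; }));

    // Binary search for the first row that starts after pc ...
    auto it = std::upper_bound(table.begin(), table.end(), pc,
        [](std::uintptr_t addr, const CallSite& site) { return addr < site.start; });
    if (it == table.begin()) {
        return 0; // pc lies before every row
    }
    // ... then step back to the last row that starts at or before pc.
    --it;
    return (pc - it->start < it->length) ? it->landingPad : 0;
}
```

None of this work happens on the non-throwing path, which is why the try block itself costs nothing at run time in this model.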

Your extra cost is a table lookup.

*Some minor timing voodoo may occur, as the presence of a PC lookup table may affect memory layout, which may affect CPU cache misses.
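
The "comment out the throw" experiment can also be run from a single program. The following is a rough single-threaded sketch, not the original test: `step`, `runLoop`, and `timeLoopMicros` are hypothetical names, and the choice of `std::chrono::steady_clock` is an assumption of this sketch. Absolute timings will vary with compiler, flags, and machine.

```cpp
#include <cassert>
#include <chrono>

// Decrements n and optionally throws, mirroring doSomething() in the question.
static void step(int& n, bool shouldThrow)
{
    --n;
    if (shouldThrow) {
        throw 1;
    }
}

// Runs the loop `iterations` times and returns how many exceptions were caught.
int runLoop(int iterations, bool shouldThrow)
{
    int n = iterations;
    int caught = 0;
    while (n > 0) {
        try {
            step(n, shouldThrow);
        } catch (int) {
            ++caught;
        }
    }
    return caught;
}

// Returns the wall-clock time of one run in microseconds.
long long timeLoopMicros(int iterations, bool shouldThrow)
{
    auto begin = std::chrono::steady_clock::now();
    int caught = runLoop(iterations, shouldThrow);
    auto end = std::chrono::steady_clock::now();

    // Sanity check: every iteration throws exactly once, or never.
    assert(caught == (shouldThrow ? iterations : 0));
    (void)caught;

    return std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
}
```

Comparing `timeLoopMicros(100000, true)` with `timeLoopMicros(100000, false)` should reproduce the gap described above: the throwing run pays for the throw machinery on every iteration, while the non-throwing run executes the try block at essentially no cost.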

Drew Dormann
  • "Your extra cost is a table lookup." is understating the matter. A table lookup in a hot table (in this code, the table is definitely hot, since the same exception is thrown from the same line over and over) will cost very little. One big cost `setjmp`/`longjmp` avoid is memory allocation; every thrown exception involves dynamic memory allocation (e.g. on `g++`, it calls `__cxa_allocate_exception`, which must be freed later), then the `throw` itself involves assembling a ton of stack unwinding info (most of which is thrown away without being used in this case) before leaping to handler code. – ShadowRanger Sep 11 '19 at 15:24