Do gcc and clang/LLVM generate a hash table with a perfect hash function, keyed by all return addresses from exception throwing functions? Do they use the same algorithm (to generate the hash function) as gperf?
Asked
Active
Viewed 94 times
0
-
The short answer is: no. How could a hash table, of some kind, can possibly be of any use for implementing exception handling? – Sam Varshavchik Jul 30 '22 at 00:16
-
You can find out by yourself, by writing a sample piece of code in a .cc file, compiling it with `-O2 -S`, and looking at the generated code. – Bruno Haible Jul 30 '22 at 08:14
-
@SamVarshavchik if stackframe B was pushed on stackframe A, I assume the return address in B is the key in an associative lookup that returns the address of the exception handling code for A. Or, is there a better way? – WaltK Jul 30 '22 at 15:27
-
@BrunoHaible can you see the answer in this output? https://gist.github.com/wkaras/e11f8e8b578f02e1eeb017b83c76166a – WaltK Jul 30 '22 at 15:28
-
Given that the stack needs to be unwound in a very specific order, what would some hashtable, that has its own overhead to maintain, including dynamic memory allocation. would actually accomplish that simply unwinding the stack via a linked list structure (on the stack itself) that links all the exceptions and the objects that need to be destroyed, in order, as the stack is unwound, cannot accomplish? Especially that this will not need any dynamic memory allocation, whatsoever? – Sam Varshavchik Jul 30 '22 at 15:41
-
@WaltK When I compile the sample code from your gist using `gcc -O2 -S` or `clang -O2 -S`, I see the exception tables: the data tables with `.uleb128` instructions. These are data structures that map instruction pointer to exception handler. It's not a hash table, but rather a range-based table, to save space. If it were represented as a hash table, it would be huge. – Bruno Haible Jul 30 '22 at 17:33
-
@SamVarshavchik not sure of how the details of that would work? Would it have zero instructions executed, for a function that neither caught nor threw, in the non-throwing case? – WaltK Jul 30 '22 at 18:16
-
@BrunoHaible what is a ranged-based table? Is it a sorted array? – WaltK Jul 30 '22 at 18:18
-
Why would a function in the call chain that doesn't catch the exception getting thrown, nor does have any object in automatic scope that needs to be destroyed when the exception gets thrown and caught earlier in call frame, care that an exception was thrown? – Sam Varshavchik Jul 30 '22 at 18:19
-
@SamVarshavchik please see my gist above. It needs to destroy local objects. – WaltK Jul 30 '22 at 18:28
-
@BrunoHaible with a perfect hashing approach, the array of exception handling entry points could be simply an array of pointers. But the const data needed by the perfect hashing function might be large. I know nothing about perfect hashing functions except that they exist. – WaltK Jul 30 '22 at 18:29
-
Did you read the part of my comment that mentioned "object in automatic scope"? That's what "local object" means. This is not a complicated concept: a linked list on stack, each time an object gets created in automatic scope an appropriate entry is added to the stack, pointing to the code that destroys the object in the event of an exception. A `catch` block pushes the catch handler+matching object metadata onto the linked list. A thrown exception unwinds the linked list, destroying each object, until a suitable entry for the catch handler is found. Simple. No need for some weird hash table. – Sam Varshavchik Jul 30 '22 at 18:34
-
@SamVarshavchik fair enough but the implementation you describe requires intermediate functions (that neither directly throw or catch) to execute instructions in the no exception case (for handling of potential exceptions). – WaltK Jul 30 '22 at 19:12
-
That was metaphorical. Each address in the linked list could be pointing to somewhere within the existing function's code that destroys the object, and then peels off the next node off the linked list, and jumps to that handler -- or to the exception handler in the function itself. The whole thing gets reduced to, effectively, a bunch of `goto`s, which each chunk of code peeling off the next node from the stack. In any case, these are all implementation details. Very smart people write the compiler for these things, and they know how to generate optimal code. – Sam Varshavchik Jul 30 '22 at 19:15
-
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/246917/discussion-between-waltk-and-sam-varshavchik). – WaltK Jul 30 '22 at 19:16