
I want to know which code and files in the glibc library are responsible for generating traps for floating point exceptions when traps are enabled.

Currently, GCC for RISC-V does not trap floating point exceptions. I am interested in adding this feature. So, I was looking at how this functionality is implemented in GCC for x86.

I am aware that we can trap signals as described in this question (Trapping floating-point overflow in C), but I want to know more details about how it works.

I went through the files in glibc/math which, as far as I can tell, are in some way responsible for generating traps, such as:

fenv.h
feenablxcpt.c
fegetexcept.c
feupdateenv.c

and many other files starting with fe.

All these files are also present in glibc for RISC-V. I am not able to figure out how glibc for x86 is able to generate traps.

FrackeR011
    Almost the only 'floating point exception' you get these days is _integer_ divide by zero. The floating point code can and does generate infinities or NaN (not-a-number) values instead — and NaNs of the non-signalling variety. – Jonathan Leffler Jun 25 '19 at 05:10
  • @JonathanLeffler I think that is because only trap for divide by zero is enabled by default. We can enable traps for other floating point exceptions using feenableexcept function. – FrackeR011 Jun 25 '19 at 05:16

1 Answer


These traps are usually generated by the hardware itself, at the instruction set architecture (ISA) level. This is the case on x86-64 in particular.

I want to know which code and files in the glibc library are responsible for generating traps for floating point exceptions when traps are enabled.

So there is no such file. However, the operating system kernel (notably with signal(7)-s on Linux...) translates these traps into something else.

Please read Operating Systems: Three Easy Pieces for more, and study the x86-64 instruction set in detail.

A more familiar example is integer division by zero. On most hardware, it produces a machine trap (or machine exception), handled by the kernel. On some hardware (IIRC, PowerPC), it gives -1 as a result and sets some bit in a status register; further machine code could test that bit. I believe that the GCC compiler would, in some cases and with some optimizations disabled, generate such a test after every division, but it is not required to do so.

The C language (read n1570, which is practically the C11 standard) defines the notion of undefined behavior so that such situations can be handled as quickly and simply as possible. Read Lattner's What Every C Programmer Should Know About Undefined Behavior blog posts.

Since you mention RISC-V, read about the RISC philosophy of the previous century, and be aware that designing out-of-order and superscalar processors requires a lot of engineering effort. My guess is that if you invested as much R&D (that means tens of billions of US$ or €) into a RISC-V chip as Intel -or, to a lesser extent, AMD- did into x86-64, you could get performance comparable to current x86-64 processors. Notice that SPARC or PowerPC (or perhaps ARM) chips are RISC-like, and their best processors are nearly comparable in performance to Intel chips, yet probably received a tenth of the R&D investment that Intel put into its microprocessors.

Basile Starynkevitch
  • Thanks for the answer. I am aware that some bits are set when any floating point error like division by zero occurs. Using functions defined in fenv.h, we can enable traps. So, whenever any trap occurs, instead of giving the operation a value like NaN, it breaks from the normal flow and a trap-handling routine is invoked. It would be helpful if you could point to the code in glibc or somewhere else which is responsible for generating traps. Or are you saying that this can be implemented only at the ISA level? – FrackeR011 Jun 25 '19 at 05:13
  • 1
    @FrackeR011: When the CPU executes a division by zero it generates an exception, and the CPU starts executing the kernel's exception handler. The kernel's exception handler figures out who did what and does whatever the kernel does (notifies the process somehow). Then the C library in user-space gets this notification from the kernel and converts it into a "C language signal". Of course other languages do other things (e.g. for Java the virtual machine would convert it into a `throw()` instead). – Brendan Jun 25 '19 at 05:30
  • @FrackeR011: Note that for some kernels (Linux), "notify the process somehow" is likely to be something that looks very similar to a "C language signal", so the code in the C library is likely to be relatively minimum for that case. – Brendan Jun 25 '19 at 05:33
  • @FrackeR011: Mostly; there is no code in the C library that generates the exception/trap because the CPU does it by itself regardless of which language the code the CPU is executing was or which OS (if any) is being used; and there is only code to convert the kernel's notification into a "C library signal". – Brendan Jun 25 '19 at 05:38
  • @Brendan If I want a CPU with the RISC-V architecture to generate traps for floating point exceptions (RISC-V does not generate traps for floating point exceptions like divide by zero), how should I go about changing that? – FrackeR011 Jun 25 '19 at 05:43
  • 2
    @FrackeR011: For RISC-V (where floating point problems cause flags to be set and don't cause an exception) I'd expect that the compiler has to inject a check after every floating point instruction, so there'd still be no code in the library to generate the trap because it'd be generated by the compiler and not the library. Note: I would also be tempted to assume all the compiler generated "floating point flag checks" add up to a massive performance disaster. – Brendan Jun 25 '19 at 06:11
  • @FrackeR011: Hrm - looking into it a little more, it seems like the designers thought that for embedded systems nobody cares (nobody uses floating point exceptions) so it wasn't worth the hassle; and then they're planning to add an "J extension" for managed languages later on that will probably include a floating point exception mechanism. – Brendan Jun 25 '19 at 06:14
  • @Brendan: the compiler-generated checks don't necessarily add performance disasters. At least that was the [RISC](https://en.wikipedia.org/wiki/Reduced_instruction_set_computer) philosophy since the previous century. Don't forget that efficient implementation of complex instruction sets -such as x86-64- requires a massive amount of engineering investment for the design of the microprocessor chip. CISC implementation using [out-of-order](https://en.wikipedia.org/wiki/Out-of-order_execution) & [superscalar](https://en.wikipedia.org/wiki/Superscalar_processor) techniques is difficult – Basile Starynkevitch Jun 25 '19 at 07:36
  • 1
    @Brendan According to [this post on HackerNews](https://news.ycombinator.com/item?id=17611369) (quoting the RISC-V spec.), "_only a single branch instruction needs to be added to each divide operation, and this branch instruction can be inserted after the divide and should normally be very predictably not taken, adding little runtime overhead_". Presumably the same hold for floating-point "exceptions" (which are noted in flags). – TripeHound Jun 25 '19 at 08:51
  • @TripeHound: That HackerNews person is only talking about "trap on division by zero". If you use `feenableexcept()` to enable "trap on precision loss" (or any of the others - underflow, overflow, etc) then you'd have to worry about every single floating point operation (except for "unary negation"). – Brendan Jun 25 '19 at 10:26
  • @Brendan I appreciate that, but while there'd be some inflation of the code-size if a compiler inserted conditional branches after all such operations (or, perhaps, selected ones), the implication of that quote is that the branch-prediction logic should minimise runtime overhead (assuming, I guess, much of the FP code is in loops and you don't have a mass of single-execution code). – TripeHound Jun 25 '19 at 10:41
  • @TripeHound: Sure - words like "massive", "little", etc are relative and therefore meaningless without a reference point (e.g. "little compared to something large" can be exactly the same size as "massive compared to something tiny"). What I really mean is "infinitely more than zero overhead", but I decided that "massive" sounded more forgiving. – Brendan Jun 25 '19 at 10:50