I'm trying to clear the floating-point divide-by-zero flag in order to ignore that exception. I expect that with the flag set (which I believe is the default behavior; the call is commented out below), my error handler will fire. However, _mm_div_ss doesn't seem to raise SIGFPE. Any ideas?

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <signal.h>
#include <string.h>
#include <xmmintrin.h>

static void sigaction_sfpe(int signal, siginfo_t *si, void *arg)
{
    printf("inside SIGFPE handler\nexit now.");
    exit(1);
}

int main()
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = sigaction_sfpe;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGFPE, &sa, NULL);

    //_mm_setcsr(0x00001D80); // catch all FPE except divide by zero

    __m128 s1, s2;
    s1 = _mm_set_ps(1.0, 1.0, 1.0, 1.0);
    s2 = _mm_set_ps(0.0, 0.0, 0.0, 0.0);
    _mm_div_ss(s1, s2);

    printf("done (no error).\n");

    return 0;
}

Output from above code:

$ gcc a.c
$ ./a.out 
done (no error).

As you can see, my handler is never reached. Side note: I've tried a couple of different compiler flags (-msse3, -march=native) with no change.

gcc (Debian 5.3.1-7) 5.3.1 20160121

Some info from /proc/cpuinfo

model name      : Intel(R) Core(TM) i3 CPU       M 380  @ 2.53GHz
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm arat dtherm tpr_shadow vnmi flexpriority ept vpid
BurnsBA
  • I think you have to unmask the relevant exception in the MXCSR. All FP exceptions are masked by default. I haven't tried writing a program that generates signals from FP exceptions, but I would have guessed that your `_mm_setcsr` would work if that constant is right. For testing, you can actually trigger SIGFPE on x86 Linux with an integer divide by zero, though, as required by POSIX. (http://stackoverflow.com/questions/37262572/on-which-platforms-does-integer-divide-by-zero-trigger-a-floating-point-exceptio) – Peter Cordes Sep 25 '16 at 20:56
  • Even with `_mm_setcsr(0x00001F80)` nothing happens. You're right that I can test SIGFPE with integer divide by zero, but I'm trying to get MXCSR to control whether that happens or not. – BurnsBA Sep 25 '16 at 22:20
  • Can't edit my comment, but `_mm_setcsr(0x00000000)` has the same effect (no signal raised). – BurnsBA Sep 25 '16 at 23:58
  • Did you confirm that the divide instruction is even present in the binary? Anything but `-O0` will optimize it away because you don't use the result. – Peter Cordes Sep 26 '16 at 01:32
  • I tried setting/clearing the flags and compiling with `-O0`, but no change – BurnsBA Sep 26 '16 at 01:58

1 Answer

Two things.

First, I misunderstood the documentation. Exceptions need to be unmasked to be caught. Calling _mm_setcsr(0x00001D80); will allow SIGFPE to fire on divide by zero.
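To make the mask arithmetic concrete, here is a minimal sketch that derives 0x1D80 from the default MXCSR value 0x1F80 instead of hard-coding it. The helper names `mxcsr_unmask` and `unmask_sse_div_zero` are made up for illustration; `_mm_getcsr`, `_mm_setcsr` and `_MM_MASK_DIV_ZERO` come from `<xmmintrin.h>`. It assumes an x86 target with SSE enabled.

```c
#include <xmmintrin.h>

/* Hypothetical helper: clear the given mask bits in an MXCSR value.
   A cleared mask bit means the matching exception is unmasked and traps. */
unsigned int mxcsr_unmask(unsigned int csr, unsigned int mask_bits)
{
    return csr & ~mask_bits;
}

/* Hypothetical helper: unmask only divide-by-zero in the live MXCSR,
   leaving rounding mode and the other mask bits untouched. */
void unmask_sse_div_zero(void)
{
    _mm_setcsr(mxcsr_unmask(_mm_getcsr(), _MM_MASK_DIV_ZERO));
}
```

Calling `unmask_sse_div_zero()` after installing the SIGFPE handler should have the same effect as `_mm_setcsr(0x00001D80)` when MXCSR still holds its default value, without overwriting any other bits that may have been changed.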

Second, gcc was dropping my divide instruction even with -O0, apparently because the result of _mm_div_ss was discarded.

Given source line

_mm_div_ss(s1, s2);

Compiling with gcc -S -O0 -msse2 a.c gives

76     movaps  -24(%ebp), %xmm0
77     movaps  %xmm0, -72(%ebp)
78     movaps  -40(%ebp), %xmm0
79     movaps  %xmm0, -88(%ebp)

a1     subl    $12, %esp        ; renumbered to show insertion below
a2     pushl   $.LC2
a3     call    puts
a4     addl    $16, %esp

While source line

s2 = _mm_div_ss(s1, s2); // add "s2 = "

gives

76     movaps  -24(%ebp), %xmm0
77     movaps  %xmm0, -72(%ebp)
78     movaps  -40(%ebp), %xmm0
79     movaps  %xmm0, -88(%ebp)
       movaps  -72(%ebp), %xmm0
       divss   -88(%ebp), %xmm0
       movaps  %xmm0, -40(%ebp)
a1     subl    $12, %esp
a2     pushl   $.LC2
a3     call    puts
a4     addl    $16, %esp

With those changes, the SIGFPE handler is called according to the divide-by-zero flag in MXCSR.
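Putting both fixes together, the following is a rough sketch of a self-checking variant, not the exact program from the question. It unmasks divide-by-zero, keeps the division alive by using `volatile` operands and a `volatile` result, and recovers from the handler with `siglongjmp` instead of calling `exit`. The names `div_ss_raises_sigfpe`, `on_sigfpe` and `fpe_env` are invented for this example, and it assumes Linux on x86 with SSE.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <xmmintrin.h>

static sigjmp_buf fpe_env;

/* Jump back out of the handler: returning normally would re-execute
   the faulting divss and raise SIGFPE again. */
static void on_sigfpe(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)si; (void)ctx;
    siglongjmp(fpe_env, 1);
}

/* Hypothetical helper: returns 1 if num / den raised SIGFPE with the
   divide-by-zero exception unmasked, 0 otherwise. */
int div_ss_raises_sigfpe(float num, float den)
{
    struct sigaction sa, old;
    unsigned int saved_csr = _mm_getcsr();
    volatile float n = num, d = den;   /* force run-time operands */
    volatile int raised = 0;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGFPE, &sa, &old);

    if (sigsetjmp(fpe_env, 1) == 0) {
        /* Unmask divide-by-zero so the division traps. */
        _mm_setcsr(saved_csr & ~_MM_MASK_DIV_ZERO);
        /* Storing the result to a volatile keeps the divss in the binary. */
        volatile float q = _mm_cvtss_f32(_mm_div_ss(_mm_set_ss(n), _mm_set_ss(d)));
        (void)q;
    } else {
        raised = 1;                    /* arrived here via siglongjmp */
    }

    _mm_setcsr(saved_csr);             /* restore the caller's MXCSR */
    sigaction(SIGFPE, &old, NULL);     /* restore the previous handler */
    return raised;
}
```

On a native x86 Linux system, `div_ss_raises_sigfpe(1.0f, 0.0f)` would be expected to report the signal, while a finite division such as `div_ss_raises_sigfpe(1.0f, 2.0f)` would not. Emulated environments may not deliver SIGFPE for unmasked SSE exceptions.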

BurnsBA