MinGW gcc set fp rounding mode

Question

I'm using the gcc compiler, and I want to be able to change the SSE rounding mode quickly. The following code works when I compile it under Linux:

#include <xmmintrin.h>
unsigned int _mxcsr_up = _MM_MASK_MASK | _MM_ROUND_UP;
unsigned int _mxcsr_down = _MM_MASK_MASK | _MM_ROUND_DOWN;
unsigned int _mxcsr_n = _MM_MASK_MASK;

void round_nearest_mode() {
    asm (
    "ldmxcsr %0" : : "m" (_mxcsr_n)
    );
}

void round_up_mode() {
    asm (
    "ldmxcsr %0" : : "m" (_mxcsr_up)
    );
}

void round_down_mode() {
    asm (
    "ldmxcsr %0" : : "m" (_mxcsr_down)
    );
}

But when I compile it under Windows using MinGW, the rounding mode does not change. What is the reason?

If you are in a hosted environment, [try the standard library instead of inline assembly](http://en.cppreference.com/w/c/numeric/fenv/feround) — StoryTeller - Unslander Monica, Nov 20 '16 at 09:10
@StoryTeller I include the header, but I cannot use its contents. I think the condition '#if _GLIBCXX_USE_C99_FENV_TR1' is not satisfied, but I don't know why. — Konstantin Isupov, Nov 20 '16 at 09:55
Build your project as C99. Add `-std=c99` to the compiler invocation (or look for the option in the project options if you use an IDE). — StoryTeller - Unslander Monica, Nov 20 '16 at 10:00
What makes you think the mode is not changed? How are you testing this? — David Wohlferd, Nov 20 '16 at 10:01
@DavidWohlferd: I think he really isn't changing rounding mode, because he's ORing the _MM_MASK_MASK instead of ANDing it. — Peter Cordes, Nov 20 '16 at 10:09
@StoryTeller Unfortunately, it did not help: https://s21.postimg.org/fds92e91j/image.png — Konstantin Isupov, Nov 20 '16 at 10:25
@PeterCordes The rounding mode is changing. Recursive summation of rounded numbers was performed. Results from Linux: rounding down = 1.04008e-21, rounding up = 1.04026e-21, exact (multiple-precision) = 1.04017e-21. — Konstantin Isupov, Nov 20 '16 at 10:27
So you are using `fegetround()` to check the rounding setting? I believe it uses the `fnstcw` instruction to get its value in mingw. The value set with the `ldmxcsr` instruction needs to be retrieved via the `stmxcsr` instruction, which (as Peter has mentioned) can be retrieved via `_mm_getcsr()` (and in `fegetenv()` if you check `__unused1`). — David Wohlferd, Nov 20 '16 at 11:49

score 1 · Answer 1 · edited May 23 '17 at 11:59

The same header that provides the _MM_ROUND_UP constants also defines _mm_setcsr(unsigned int i) and _mm_getcsr(void) intrinsic wrappers around the relevant instructions.

You should normally retrieve the old value, OR or ANDN the bit you want to change, then apply the new value. (e.g. mxcsr &= ~SOME_BITS). You won't find many examples that just use LDMXCSR without doing a STMXCSR first.

At first glance it looks like you're doing that part wrong, because you OR a constant with MASK in its name into every value instead of ANDing with it. Despite the name, though, _MM_MASK_MASK (0x1f80 in GCC's xmmintrin.h) covers the exception-mask bits, not the rounding-control bits (those are _MM_ROUND_MASK, 0x6000). So your three values do select different rounding modes, with all exceptions masked. What your code does not do is preserve the other MXCSR bits (e.g. FTZ and DAZ), which is one more reason to read the old value first.
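The read-modify-write pattern described above can be sketched with the intrinsics like this. The helper name set_sse_rounding is hypothetical (it is not from the question or from any header), and the sketch assumes an x86 target with SSE enabled:

```c
#include <xmmintrin.h>

/* Hypothetical helper: change only the rounding-control bits of MXCSR
   and return the previous MXCSR value so the caller can restore it. */
static unsigned int set_sse_rounding(unsigned int mode)
{
    unsigned int old = _mm_getcsr();                 /* STMXCSR */
    unsigned int cleared = old & ~_MM_ROUND_MASK;    /* keep every other bit */

    _mm_setcsr(cleared | (mode & _MM_ROUND_MASK));   /* LDMXCSR */
    return old;
}
```

A call such as `unsigned int saved = set_sse_rounding(_MM_ROUND_UP);` followed later by `_mm_setcsr(saved);` switches the mode and then restores it. The same header also provides the _MM_SET_ROUNDING_MODE and _MM_GET_ROUNDING_MODE macros, which do the same masking for you.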

As @StoryTeller points out, you don't need inline asm or intrinsics to change rounding modes, since the four rounding modes provided by x86 hardware match the four defined by fenv.h in C99 (FE_DOWNWARD, FE_TONEAREST (the default), FE_TOWARDZERO, and FE_UPWARD). You can set one with, for example, fesetround(FE_DOWNWARD);.

If you want to change rounding modes on the fly and make sure the optimizer doesn't reorder any FP ops to a place where the rounding mode was set differently, you need #pragma STDC FENV_ACCESS ON, but gcc doesn't support it. See also this gcc bug from 2008 which is still open: Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented).

Doing it manually with asm volatile still won't stop CSE from treating an x/y computed earlier as the same value and skipping the recomputation after the asm statement, unless you also pass x or y to the asm statement as a read-write operand that the asm never actually touches, e.g.

asm volatile("" : "+g"(x));  // optimizer must not make any assumptions about x's value.

You could put the LDMXCSR inside that same inline-asm statement, to guarantee that the point where the rounding mode changed is also the point where the compiler treats x as having changed.
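A rough sketch of that combined statement, assuming GCC-style inline asm on x86-64. Both function names are hypothetical, and the "+x" constraint (keep the value in an SSE register) is a choice made for this sketch in place of the "+g" shown above:

```c
#include <xmmintrin.h>

/* Hypothetical helper: load a new MXCSR value and, in the same asm
   statement, make the compiler treat x as modified at that point. */
static inline double load_mxcsr_and_launder(unsigned int csr, double x)
{
    __asm__ volatile("ldmxcsr %1" : "+x"(x) : "m"(csr));
    return x;
}

/* Hypothetical usage: divide with rounding toward +infinity. */
static double div_round_up(double x, double y)
{
    unsigned int saved = _mm_getcsr();
    unsigned int up = (saved & ~_MM_ROUND_MASK) | _MM_ROUND_UP;
    double q;

    x = load_mxcsr_and_launder(up, x);     /* x/y now depends on this asm */
    q = x / y;                             /* cannot be reused from earlier */
    q = load_mxcsr_and_launder(saved, q);  /* restore; needs q computed first */
    return q;
}
```

Because the division consumes the output of the first asm statement and feeds the input of the second, the data dependencies pin it between the two mode changes without relying on FENV_ACCESS support.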
