8

I've scoured the web to no avail.

Is there a way to make Xcode and Visual C++ treat denormalised numbers as 0? I would have expected a setting for this in the IDE preferences, but I can't find one.

I'm doing some cross-platform audio stuff and need to stop certain processors hogging resources.

Cheers

Adam

3 Answers

12

You're looking for a platform-defined way to set FTZ and/or DAZ in the MXCSR register (on x86 with SSE or x86-64); see https://stackoverflow.com/a/2487733/567292

Usually this is called something like _controlfp; Microsoft documentation is at http://msdn.microsoft.com/en-us/library/e9b52ceh.aspx

You can also use the _MM_SET_FLUSH_ZERO_MODE macro: http://msdn.microsoft.com/en-us/library/a8b5ts9s(v=vs.71).aspx - this is probably the most portable method across compilers.

ecatmur
4

For disabling denormals globally I use these 2 macros:

#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr

// Warning: both macros have to be used in the same scope,
// because the second one reads the variable declared by the first.
#define MXCSR_SET_DAZ_AND_FTZ \
unsigned int oldMXCSR__ = _mm_getcsr(); /* read the old MXCSR setting */ \
unsigned int newMXCSR__ = oldMXCSR__ | 0x8040; /* set DAZ and FTZ bits */ \
_mm_setcsr( newMXCSR__ ); /* write the new setting to MXCSR */

#define MXCSR_RESET_DAZ_AND_FTZ \
/* restore the old MXCSR setting to turn denormals back on if they were on */ \
_mm_setcsr( oldMXCSR__ );

I call the first one at the beginning of the process and the second at the end. Unfortunately, this does not seem to work well on Windows.
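A small RAII guard is one way around the same-scope restriction of the two macros. This is only a sketch: `ScopedFlushDenormals`, `flushDenormalsEnabled` and `processBlock` are hypothetical names that do not come from this answer or from any library, and the code assumes an x86 or x86-64 target with SSE.

```cpp
#include <cassert>
#include <xmmintrin.h>  // _mm_getcsr, _mm_setcsr (x86 / x86-64 with SSE)

// Hypothetical helper: sets FTZ (bit 15) and DAZ (bit 6) on construction
// and restores the previous MXCSR value on destruction.
class ScopedFlushDenormals {
public:
    ScopedFlushDenormals() : saved_(_mm_getcsr()) {
        _mm_setcsr(saved_ | 0x8040u);
    }
    ~ScopedFlushDenormals() { _mm_setcsr(saved_); }

    ScopedFlushDenormals(const ScopedFlushDenormals&) = delete;
    ScopedFlushDenormals& operator=(const ScopedFlushDenormals&) = delete;

private:
    unsigned int saved_;  // MXCSR value seen when the guard was created
};

// Returns true if both FTZ and DAZ are set on the calling thread.
inline bool flushDenormalsEnabled() {
    return (_mm_getcsr() & 0x8040u) == 0x8040u;
}

// Hypothetical usage: the flags stay set for the lifetime of `guard`.
inline void processBlock() {
    ScopedFlushDenormals guard;
    assert(flushDenormalsEnabled());
    // ... audio processing would go here ...
}
```

Because the destructor restores the saved value, the guard also behaves sensibly on early returns and exceptions, which the macro pair does not handle.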

To flush denormals locally I use this:

const Float32 k_DENORMAL_DC = 1e-25f;
inline void FlushDenormalToZero(Float32& ioFloat) 
{ 
    ioFloat += k_DENORMAL_DC;
    ioFloat -= k_DENORMAL_DC;    
} 
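To see why this works: a float carries only 24 significant bits, so adding 1e-25f to a much smaller value rounds the sum to exactly 1e-25f, and the subtraction then leaves zero. The sketch below checks that behaviour. It uses plain `float` instead of `Float32`, the names `kDenormalDC`, `isDenormal` and `flushed` are hypothetical, and it assumes the default round-to-nearest mode and no `-ffast-math` style optimisation (which might remove the add/subtract pair).

```cpp
#include <cassert>
#include <cmath>  // std::fpclassify, FP_SUBNORMAL

constexpr float kDenormalDC = 1e-25f;

// Same trick as above: add and subtract a small constant so that
// values far below it are lost to rounding and come back as zero.
inline void flushDenormalToZero(float& value) {
    value += kDenormalDC;
    value -= kDenormalDC;
}

// Hypothetical helper: true if `value` is a denormal (subnormal) float.
inline bool isDenormal(float value) {
    return std::fpclassify(value) == FP_SUBNORMAL;
}

// Hypothetical convenience wrapper that returns the flushed value.
inline float flushed(float value) {
    flushDenormalToZero(value);
    return value;
}

// Hypothetical self-check: a denormal goes to zero, a normal sample survives.
inline void checkFlush() {
    assert(isDenormal(1e-40f));
    assert(flushed(1e-40f) == 0.0f);
    assert(flushed(1.0f) == 1.0f);
}
```

The cost of the trick is that it also discards normal values that are tiny compared with 1e-25f, which is far below anything audible in typical audio signal ranges.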
Kevin MOLCARD
2

See the update (4 Aug 2022) at the end of this entry.

To do this, use the Intel Intrinsics macros during program startup. For example:

#include <immintrin.h> 
int main() {
  _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); 
}

In my version of MSVC, this emitted the following assembly code:

    stmxcsr DWORD PTR tv805[rsp]
    mov eax, DWORD PTR tv805[rsp]
    bts eax, 15
    mov DWORD PTR tv807[rsp], eax
    ldmxcsr DWORD PTR tv807[rsp]

MXCSR is the SSE control and status register, and this code sets bit 15, which turns flush-to-zero (FTZ) mode on.

One thing to note: this only affects denormals that result from a computation. If you also want denormals to be treated as zero when they are used as input, you need to set the DAZ flag (denormals are zero) as well, using the following macro:

_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);

See https://software.intel.com/en-us/cpp-compiler-developer-guide-and-reference-setting-the-ftz-and-daz-flags for more information.
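The difference between the two flags can be observed by multiplying a denormal by a large factor, so that the result is a normal number unless the input itself is treated as zero. The following is a sketch under assumptions: the helper names are hypothetical, the target is x86-64 (where DAZ is available), and `volatile` is used only to stop the compiler from folding the arithmetic at compile time.

```cpp
#include <cassert>
#include <immintrin.h>  // _MM_SET_FLUSH_ZERO_MODE, _MM_SET_DENORMALS_ZERO_MODE

// Hypothetical helper: denormal input times a large factor.
// The exact product is about 1e-10, a perfectly normal float.
inline float scaleDenormal() {
    volatile float tiny = 1e-40f;   // denormal: below the smallest normal float
    volatile float scale = 1e30f;
    volatile float result = tiny * scale;
    return result;
}

// FTZ only: the denormal input is still read as-is, so the result is non-zero.
inline float resultWithFtzOnly() {
    const unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_OFF);
    const float r = scaleDenormal();
    _mm_setcsr(saved);  // restore the caller's MXCSR
    return r;
}

// FTZ and DAZ: the denormal input is read as zero, so the result is zero.
inline float resultWithFtzAndDaz() {
    const unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    const float r = scaleDenormal();
    _mm_setcsr(saved);  // restore the caller's MXCSR
    return r;
}

// Hypothetical self-check of the behaviour described above.
inline void checkFtzVersusDaz() {
    assert(resultWithFtzOnly() > 0.0f);
    assert(resultWithFtzAndDaz() == 0.0f);
}
```

For audio code that may receive denormals from buffers, files or other threads, setting both flags is therefore usually what you want.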

Also note that you need to set MXCSR for each thread, as the values contained are local to each thread.

Update (4 Aug 2022): I've now had to deal with ARM processors as well. The following is a cross-platform macro that works on both ARM and Intel:

#include <stdint.h> // uint64_t, used by the ARM branch

#ifndef __ARM_ARCH
extern "C" {
   extern unsigned int _mm_getcsr();
   extern void _mm_setcsr(unsigned int);
}
#define MY_FAST_FLOATS _mm_setcsr(_mm_getcsr() | 0x8040U)
#else
#define MY_FPU_GETCW(fpcr) __asm__ __volatile__("mrs %0, fpcr" : "=r"(fpcr))
#define MY_FPU_SETCW(fpcr) __asm__ __volatile__("msr fpcr, %0" : : "r"(fpcr))
#define MY_FAST_FLOATS \
   { \
      uint64_t eE2Hsb4v {}; /* random name to avoid shadowing warnings */ \
      MY_FPU_GETCW(eE2Hsb4v); \
      eE2Hsb4v |= (1 << 24) | (1 << 19); /* FZ flag, FZ16 flag; flush denormals to zero */ \
      MY_FPU_SETCW(eE2Hsb4v); \
   } \
   static_assert(true, "require semi-colon after macro with this assert")
#endif
rsjaffe