I'm using Xcode to develop in C++ on OS X Mountain Lion, to run on the local machine. I'm having performance issues relating to denormal numbers and I wish to set the FTZ flag so that they will be flushed to zero. (I have checked that denormals are indeed the problem, and flushing them to zero will not cause accuracy issues in my case.) However, I can't find any information about how I should actually achieve this in Xcode. Is it an option I can change in the build settings? Or some code I should type somewhere? Any help would be much appreciated.

N. Virgo
  • Does XCode support SSE intrinsics? If so then [my solution from here](http://stackoverflow.com/questions/9314534/why-does-changing-0-1f-to-0-slow-down-performance-by-10x) should be relevant. That is, add `_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);` to the start of the program. – Mysticial May 25 '13 at 08:28
  • @Mysticial: I could not find the definitions of `_MM_SET_FLUSH_ZERO_MODE` and `_MM_FLUSH_ZERO_ON`. Can you tell me where these come from? – Martin R May 25 '13 at 09:36
  • It should be in `<xmmintrin.h>`. It's a compiler extension that exists in nearly all mainstream C and C++ compilers. I'm not sure about XCode though. – Mysticial May 25 '13 at 15:40
  • @Mysticial: Thanks! `<xmmintrin.h>` is available in Xcode, and both `_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON)` and `fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV);` seem to work. – Martin R May 26 '13 at 05:51

1 Answer

If I understand the comments in "/usr/include/fenv.h" correctly,

    #include <fenv.h>

    // Call this from inside a function, e.g. at the start of main()
    fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV);

should do what you want.

    FE_DFL_DISABLE_SSE_DENORMS_ENV

    A pointer to a fenv_t object with the default floating-point state modifed
    to set the DAZ and FZ bits in the SSE status/control register.  When using
    this environment, denormals encountered by SSE based calculation (which
    normally should be all single and double precision scalar floating point
    calculations, and all SSE/SSE2/SSE3 computation) will be treated as zero.
    Calculation results that are denormals will also be truncated to zero.

Setting this option reduced the running time of the program in "Why does changing 0.1f to 0 slow down performance by 10x?" (link given by @Mysticial in his comment) from 27 seconds to 0.3 seconds (MacBook Pro, 2.5 GHz Intel Core 2 Duo).
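For comparison, here is a minimal sketch of the intrinsic-based route mentioned in the comments, assuming an x86 target where `float` arithmetic goes through SSE. The helper names `enable_flush_to_zero` and `halve_smallest_normal` are made up for illustration; only the `_MM_*` macros come from `<xmmintrin.h>`. Unlike `FE_DFL_DISABLE_SSE_DENORMS_ENV`, this sets only the FTZ bit and leaves DAZ alone.

```cpp
#include <cassert>
#include <limits>
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE

// Hypothetical helper: turn on flush-to-zero in the SSE control register
// (MXCSR) of the calling thread, then confirm that the bit is set.
inline void enable_flush_to_zero() {
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    assert(_MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON);
}

// Hypothetical helper: halve the smallest normal float at run time.
// The exact result is a denormal, so with FTZ on it comes back as 0.0f.
// volatile keeps the compiler from folding the product at compile time.
inline float halve_smallest_normal() {
    volatile float smallest = std::numeric_limits<float>::min();
    volatile float half = 0.5f;
    volatile float result = smallest * half;
    return result;
}
```

Calling `enable_flush_to_zero()` once at the start of `main()` should be enough for a single-threaded program. The control register is per thread, so any thread that needs different settings from the ones it started with has to make the call itself.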

Martin R
  • I get an error "Unknown type name 'fesetenv'". I'm not sure why, because the declaration `extern int fesetenv(const fenv_t * /* envp */);` is in fenv.h. I'm probably just doing something stupid - do you have any idea what it might be? – N. Virgo May 25 '13 at 13:08
  • @Nathaniel: That is strange. Are you sure that `<fenv.h>` is included? I could compile that for both OS X and iOS without problems. – Martin R May 25 '13 at 13:18
  • Oh gosh, I really was doing something stupid - I'd tried to put the fesetenv() call outside of any code block. This works great, many thanks! – N. Virgo May 26 '13 at 09:23
  • Is there a fallback on iOS since iPhone does not support SSE? – Petrus Theron Sep 02 '18 at 08:39