
I am a circuit designer, not a software engineer, so I have no idea how to track down this problem.

I am working with some IIR filter code and I am having problems with extremely slow execution times when I process very small values through the filter. To isolate the problem, I wrote this test code.

Normally, the loop runs in about 200 ms (I didn't measure it precisely). But when TestCheckBox->Checked is true, it requires about 7 seconds to run. The problem lies in the shrinking magnitudes of A, B, C and D within the loop, which is exactly what happens to the values in an IIR filter after its input goes to zero.

I believe the problem arises when the variables' exponents drop below -308. A simple fix is to declare the variables as long double, but that isn't easy to do in the actual code, and it doesn't seem like I should have to.

Any ideas why this happens and what a simple fix might be?

In case it matters, I am using C++ Builder XE3.

int j;
double A, B, C, D, E, F, G, H;
//long double A, B, C, D, E, F, G, H; // a fix
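// random(n) here is the C++ Builder RTL helper that returns an int in
// [0, n), so each line below produces a pseudo-random double in [-5.0, 5.0)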
A = (double)random(100000000)/10000000.0 - 5.0;
B = (double)random(100000000)/10000000.0 - 5.0;
C = (double)random(100000000)/10000000.0 - 5.0;
D = (double)random(100000000)/10000000.0 - 5.0;
if(TestCheckBox->Checked)
 {
  A *= 1.0E-300;
  B *= 1.0E-300;
  C *= 1.0E-300;
  D *= 1.0E-300;
 }
for(j=0; j<=1000000; j++)
{
 A *= 0.9999;
 B *= 0.9999;
 C *= 0.9999;
 D *= 0.9999;
 E = A * B + C - D; // some exercise code
 F = A - C * B + D;
 G = A + B + C + D;
 H = A * C - B + G;
 E = A * B + C - D;
 F = A - C * B + D;
 G = A + B + C + D;
 H = A * C - B + G;
 E = A * B + C - D;
 F = A - C * B + D;
 G = A + B + C + D;
 H = A * C - B + G;
}
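To see exactly where the slowdown begins, here is a small sketch (my illustration; it assumes the C++11 fpclassify from <cmath> is available in XE3) that reports the iteration at which a decaying value first becomes denormal, i.e. when its exponent drops below roughly -308:

#include <cmath>
#include <cstdio>

double x = 1.0E-300;  // same scale as the TestCheckBox->Checked case
for(int i = 0; i < 1000000; i++)
{
 x *= 0.9999;
 if(std::fpclassify(x) == FP_SUBNORMAL)  // below DBL_MIN, about 2.2E-308
  {
   printf("x went denormal at iteration %d: x = %g\n", i, x);
   break;
  }
}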

EDIT: As the answers said, the cause of this problem is denormal math, something I had never heard of. Wikipedia has a pretty nice description of it, as does the MSDN article given by Sneftel.

http://en.wikipedia.org/wiki/Denormal_number

Having said this, I still can't get my code to flush denormals. The MSDN article says to do this:

_controlfp(_DN_FLUSH, _MCW_DN) 

These definitions are not in the XE3 math libraries, however, so I used

_controlfp(0x01000000, 0x03000000)

per the article, but this is having no effect in XE3. Nor is the code suggested in the Wikipedia article.
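For reference, the SSE-based approach from the Wikipedia article looks roughly like this (a sketch; _MM_SET_FLUSH_ZERO_MODE is an SSE intrinsic from <xmmintrin.h>, and since it only changes the MXCSR register it affects SSE math only, so it would do nothing if the compiler emits x87 FPU instructions, which I gather 32-bit XE3 may do by default):

#include <xmmintrin.h>  // SSE intrinsics

// Flush denormal SSE results to zero. This sets a bit in MXCSR and has
// no effect on code executed on the x87 FPU.
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);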

Any suggestions?

user5108_Dan
  • The answers to [this question](http://stackoverflow.com/questions/2487653) might be helpful. – Mike Seymour Oct 31 '14 at 14:33
  • Numbers don't take longer to process when they become smaller. My theory is that you start triggering floating-point exceptions (underflows), and processing a few million of those may indeed take a long time. Don't compute with numbers around 1e-300; they make no physical sense. The diameter of the universe in Planck units is about 1e62. – n. m. could be an AI Oct 31 '14 at 14:41
  • Are you actually using an Intel Pentium processor? If not, I don't think the tag is relevant. – crashmstr Oct 31 '14 at 14:43

2 Answers


You're running into denormal numbers (ones smaller in magnitude than DBL_MIN, in which the leading significand bit is zero instead of the usual implicit one). Denormals extend the range of representable floating-point numbers, and are important for maintaining certain useful error bounds in FP arithmetic, but operating on them is far slower than operating on normal FP numbers. They also have lower precision. So you should try to keep all your numbers (both intermediate and final quantities) above DBL_MIN in magnitude.
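A quick way to see both effects (a sketch using only standard <cfloat> and <cstdio>):

#include <cfloat>
#include <cstdio>

printf("%g\n", DBL_MIN);         // 2.22507e-308, the smallest normal double
printf("%g\n", DBL_MIN / 1024);  // ~2.17e-311: representable as a denormal,
                                 // but with fewer significant bits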

In order to increase performance, you can force denormals to be flushed to zero by calling _controlfp(_DN_FLUSH, _MCW_DN) (or, depending on OS and compiler, a similar function). http://msdn.microsoft.com/en-us/library/e9b52ceh.aspx
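A minimal sketch of wrapping that around the hot loop, assuming an MSVC-style CRT that defines these symbols in <float.h>:

#include <float.h>

unsigned int prev = _controlfp(0, 0);  // read the current FP control word
_controlfp(_DN_FLUSH, _MCW_DN);        // flush denormal results to zero
// ... run the filter / inner loop here ...
_controlfp(prev, _MCW_DN);             // restore the previous denormal mode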

Sneftel
  • Is there a simple way to let the variables go to zero rather than drop below DBL_MIN? – user5108_Dan Oct 31 '14 at 14:35
  • @user5108_Dan [This question](http://stackoverflow.com/questions/2487653/avoiding-denormal-values-in-c) has some answers. – nwellnhof Oct 31 '14 at 14:37
  • @user5108_Dan Yes, described above. – Sneftel Oct 31 '14 at 14:41
  • @Sneftel From what I've read, _controlfp(_DN_FLUSH, _MCW_DN) should be the answer, but I am not having any luck with it. I am having some trouble finding the correct constant definitions in the help files, so I am using _controlfp(0x01000000, 0x03000000); as shown in the MSDN link you provided, but they don't work. – user5108_Dan Oct 31 '14 at 15:40

You've entered the realm of floating-point underflow, resulting in denormalized numbers. Depending on the hardware, you're likely trapping into software, which will be much, much slower than hardware operations.
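A portable workaround often used in IIR filter code (my addition, not something from the hardware discussion above) is to manually flush the state variables to zero once they decay below a threshold that is negligible for the signal but still far above DBL_MIN:

#include <cmath>

// Clamp tiny values to zero before they can reach the denormal range.
// The 1.0E-30 threshold is an arbitrary choice: well above DBL_MIN
// (~2.2E-308), yet far below any meaningful filter state.
inline double FlushTiny(double x)
{
 return (std::fabs(x) < 1.0E-30) ? 0.0 : x;
}

// usage inside the loop, e.g.:
// A = FlushTiny(A * 0.9999);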

sfjac
  • I am running my own code on a Pentium. Wouldn't I see an underflow error? Why don't the variables simply go to zero? – user5108_Dan Oct 31 '14 at 14:31
  • @user5108_Dan Underflow exceptions (and other FP exceptions) are commonly masked out by default. And they're not going to zero because underflow can generate denormal results, not just zero. – Sneftel Oct 31 '14 at 14:33
  • 1
    No, this happens silently in IEEE arithmetic unless you take special care (or use special compile options) to cause different behavior. On some systems you can intercept these traps but I think that's system-dependent. This can be a real headache for performance if it happens a lot (as can NANs, and other IEEE features) but in general it improves accuracy of the calculations. – sfjac Oct 31 '14 at 14:33