
I am trying to debug the problem I posted earlier here: C++ and pin tool -- very weird DOUBLE variable issue with IF statement. Using gdb, I tracked down the moment when the weird behavior occurs. The figure below is a gdb screenshot showing the disassembled code and the floating-point register values.

[Figure: gdb before and after the FLDZ instruction is executed]

The left-hand image shows the state before the highlighted FLDZ instruction is executed, and the right-hand image shows the state after it is executed.

I looked up the x86 ISA, and FLDZ loads +0.0 into ST(0). However, what I get is -nan instead of +0.0. Does anybody know why this happens?

The system I am using is an Intel Xeon 5645 running 64-bit CentOS, but the target program I am trying to debug is a 32-bit application. Also, as I mentioned in the earlier post, I tried two versions of gcc, 4.2.4 and 4.1.2, and observed the same problem. Thanks.

--added-- By the way, below is the source code.

void Router::Evaluate( )
{
  if (_id == 0) aaa++;

  if ( _partial_internal_cycles != 0 )
  {
    aaa += 12345;
    cout << "this is not a zero : " << endl;
    on = true;
  }

  _partial_internal_cycles += (double) 1.0;

  if ( _partial_internal_cycles >= (double)1.0 ) {
    _InternalStep( );
    _partial_internal_cycles -= (double)1.0;
  }

  if (GetSimTime() > 8646000 && _id == 0) cout << "aaa = " << aaa << endl;
  if ( on)
  {
    cout << "break. id = " << _id << endl;
    assert(false);
  }

}
ray
  • Looks like an FPU stack overflow. Do you have source code at hand? – David Heffernan Aug 08 '12 at 08:46
  • Looks like the error is raised at `if ( _partial_internal_cycles != 0 )`. Your FPU stack is full at that point. You need to somehow understand how that can be. The compilers that I am familiar with will empty the FPU stack once it has finished a calculation. Which compiler are you using? – David Heffernan Aug 08 '12 at 08:58
  • I agree with you about where the error is raised. Also, I used gcc 4.2.4. – ray Aug 08 '12 at 09:04
  • I don't use gcc, but I'm pretty sure that its policy must be to empty the FPU stack when it has finished with it. The x86 ABI surely calls for FPU stack to be empty when function calls are made. So I think your challenge is working out how the stack can be non-empty when this method is called. The problem is not in the code that you have posted, but somewhere else in your program. Could be anywhere! – David Heffernan Aug 08 '12 at 09:08
  • Very high odds that this is a pin bug. Sucks when you have to debug the processor. Use the Yahoo group to find help. – Hans Passant Aug 08 '12 at 09:25
  • I inserted an "emms" instruction before the function call and the problem was solved. More information at the [pinheads](http://tech.groups.yahoo.com/group/pinheads/message/3190) group. – ray Sep 21 '12 at 01:42

1 Answer


An exception was generated (notice the I bit is set in the stat field). As the documentation says:

If the ST(7) data register which would become the new ST(0) is not empty, both a Stack Fault and an Invalid operation exceptions are detected, setting both flags in the Status Word. The TOP register pointer in the Status Word would still be decremented and the new value in ST(0) would be the INDEFINITE NAN.

By the way, your underlying issue is because this is just the nature of floating point. It's not exact. See, for example, this gcc bug report -- and this one.

David Schwartz
  • "By the way, your underlying issue is because this is just the nature of floating point. It's not exact." That is true but not pertinent to the other question. – David Heffernan Aug 08 '12 at 08:54
  • It is. The result of the comparison depends on the precision. – David Schwartz Aug 08 '12 at 09:10
  • `x==0` returns the same value every time it is evaluated, so long as `x` is not changed. Floating point arithmetic is indeed inexact, but there's no arithmetic in that question. – David Heffernan Aug 08 '12 at 09:12
  • @DavidHeffernan: It may require loading the value of `x` into a register from memory. If the register and the memory are in different formats, then a format conversion is required to complete the load. That is an arithmetic operation. The compiler could also write `x` into memory from a register so that it can compare it to a floating point zero that is already in memory. That could require a format conversion too. If the standard requires the comparison to return true, then it will return true. If it requires it to return false, it will return false. Otherwise, it need not even be consistent. – David Schwartz Aug 08 '12 at 12:09
  • If a conversion is needed, the same conversion will be performed each time the value is loaded into the register. I think this is a side issue. Real problem is clearly the register stack overflow. – David Heffernan Aug 08 '12 at 12:27
  • @DavidHeffernan: Sure, but it may not get loaded into a register twice. The first comparison might load it into a register. The second might compare it to a zero that's already in memory. Or, very commonly, the first operation may occur in the register and then intervening code between the two operations may force the value into memory and then back, causing a precision loss. – David Schwartz Aug 08 '12 at 12:29
  • That's not actually what's happening though is it? It's just an FPU register stack overflow. Why complicate things with something irrelevant? Or do you think the fundamental problem is other than register stack overflow? – David Heffernan Aug 08 '12 at 13:03
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/15093/discussion-between-david-schwartz-and-david-heffernan) – David Schwartz Aug 08 '12 at 21:05