Hi, I have a function defined as

int compareAB(float A, float B)  
{  
    if(A > B) 
    {  
        return 1;  
    }
    else if(A < B)
    {
        return -1;  
    }  
    else
    {
        return 0;
    }
}

The above function is taking too much time for comparing in my project, as I can see from performance tools. Can I improve its run time?

One way I can think of is taking the difference of the two numbers and then comparing it with zero.
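
For example, something like this (just a sketch; the name `compareAB_diff` is only for illustration):

int compareAB_diff(float A, float B)
{
    float d = A - B;
    if (d > 0.0f)
    {
        return 1;
    }
    else if (d < 0.0f)
    {
        return -1;
    }
    return 0;
}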

Thanks,

EDIT: This function is being used in the sorting comparison function for maintaining a search tree data structure. I have also corrected the return values now.

ravi
  • That function is probably faster than everything else. The reason for it to be slow would be if the floats are often NaN or infinity, since those cases drive up the runtime significantly. – Xale May 28 '13 at 10:05
  • From the 2000 sec total time of my project, the micro-instructions inside this function are taking 38 sec. And it is never NaN or infinity. – ravi May 28 '13 at 10:10
  • If it's taking too long, don't do it. Change the upper-level algorithms. – Yann Droneaud May 28 '13 at 10:19
  • 2000 seconds of CPU time on a modern processor is some 4000-6000 billion instructions in total. 38 seconds is roughly 5% of the time. What else is going on in your code? Is it possible that the "high measurement" is simply that this instruction is "where it all comes together", in other words, the processor has to "stop here to wait for the results of something else"? If so, maybe it's the PREVIOUS (set of) instruction(s) that cause the holdup... – Mats Petersson May 28 '13 at 12:06
  • Do you specifically need the result of +/-1 rather than any positive/negative value? For example, `qsort` requires the latter, so you could just return `a-b`. And if this is for `qsort` (and the C++ tag isn't just for decoration) then consider `std::sort` instead; it can avoid using function pointers, giving large speed benefits. (A sketch of both options appears after these comments.) – Mike Seymour May 28 '13 at 12:08
  • @Xale Nothing is easier than comparing NaN or infinities. If comparison wasn't a single-cycle instruction already, it probably would have fast exits for the cases when one of the arguments is NaN or infinity. You might be thinking of denormals. Those aren't difficult to compare, but they may be difficult to compute with, one of the reasons the idea of computing `A-B` and comparing the result to zero is not an improvement. – Pascal Cuoq May 28 '13 at 12:09
  • It is unlikely much can be done to improve a simple function like this in isolation. If you show its context within the large program, more enlightened and enlightening comments may be possible. Little meaningful aid can be given with the information provided so far. – Eric Postpischil May 28 '13 at 13:09
  • As Mats Petersson asked, what else is going on in your code? All you have mentioned so far is sorting data. It is not surprising that a sort spends some time doing comparisons. All it has to do is compare, move data, and keep track of what it is doing. If your program is doing a lot of sorting and not much else, then spending 5% of its time on comparisons is normal. So, is your program doing anything else? Is it doing a lot of sorts it does not need to? You should not need to sort to **maintain** a tree. Normally, you just do incremental insertions; you do not sort it all every time. – Eric Postpischil May 28 '13 at 14:26
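
To make Mike Seymour's last point concrete, here is a sketch (illustrative only; the function names are mine, not from the thread):

#include <algorithm>
#include <cstdlib>

/* qsort needs a function pointer, so the call usually cannot be
 * inlined; any negative/zero/positive return value is acceptable. */
static int compare_for_qsort(const void *pa, const void *pb)
{
    float a = *(const float *)pa;
    float b = *(const float *)pb;
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}

void sort_with_qsort(float *data, std::size_t n)
{
    std::qsort(data, n, sizeof *data, compare_for_qsort);
}

void sort_with_std_sort(float *data, std::size_t n)
{
    std::sort(data, data + n);  /* operator< is inlined by the compiler */
}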

3 Answers

This looks like a contrived way of attempting to circumvent the "should never compare equality on floating point" rule. Comparing inequality is not very much different to comparing equality, as you are implicitly relying on floating point precision in both cases. Your final 'else' statement is an implicit A == B.

The normal idiom is `if (::fabs(A - B) < e)`, where `e` is some tolerance, although in your case you don't need the `::fabs`.

If you want different results for positive, negative and equality (within limits of computational precision), then do something like

if (A - B > e){
    return 1;
} else if (A - B < -e){
    return -1;
} else {
    return 0;
}

The best you can hope for is setting `e` to `std::numeric_limits<double>::epsilon()`. The actual value depends on the number of computational steps executed in order to arrive at A and B. `1e-08` is probably realistic.

As for speed, it is what it is unfortunately: I can't see this being either the bottleneck or running any faster.

Bathsheba
  • How do you manage to bring up the so-called “rule” you cite in a question about the efficiency of floating-point comparison? And as pointed out by EJP, this `compare` function is probably used for sorting. Just try sorting an array of enough `doubles` with the so-called “idiom” of comparing up to `1e-08`. (Note that if you use `std::numeric_limits::epsilon()` for `e`, it may appear to work for values of magnitude larger than 2, but that's only because you are misusing `std::numeric_limits::epsilon()` in addition) – Pascal Cuoq May 28 '13 at 11:13
  • Pascal: Just because the function name looks like it could be a standard sorting compare doesn't imply that to me: for starters the return values are incorrect and the designers of sorting algorithms are careful to not require equality. Feel free to post a precise answer referring to strong / weak ordering in sorting callbacks and I'll upvote it. – Bathsheba May 28 '13 at 11:29
  • This answer is rubbish because: (a) Advice against comparing for equality in floating-point is not a rule, just oft-repeated bad advice. (b) This code does not look like a contrived way of circumventing the advice. (c) The comparison of fabs(A-B) to a tolerance may be common, but it is not normal. (d) There is no foundation for using epsilon as a tolerance. Numbers above two will never have such a difference, and numbers much smaller may have much smaller differences. – Eric Postpischil May 28 '13 at 13:02
  • (e) The errors in the results of floating-point calculations are critically dependent on the structure, type, and data of previous calculations. No meaningful tolerance can be derived simply from the number of them. Errors can range from zero to infinity (or NaN), so suggesting any single value, without context, is useless. (f) The nature of the errors may be absolute, relative to the obtained result, or relative to other values, or more complex, so suggesting an absolute tolerance without context is useless. – Eric Postpischil May 28 '13 at 13:05
  • (g) Using a tolerance for comparison changes some results into other results (e.g., situations that would have returned 0 into situations that return -1). However, without knowledge of the application, it is impossible to state whether this is an improvement or a detriment. This change might convert useful results into incorrect and harmful results. – Eric Postpischil May 28 '13 at 13:06
  • Eric Postpischil: please enlighten us with a full answer. – Bathsheba May 28 '13 at 13:29
  • @EricPostpischil your comment (a) seems so wrong that it's like you've never ever written a program using floating-point computation on an Intel FPU (x87). But comment (d) is wise enough to make me look at your user page. Comparing for equality of floating-point values using operator "==" is the sure path to (un)predictable errors. Values will never match thanks to excess-precision computation done inside the x87 FPU and computation reordering by the compiler (with optimization enabled). – Yann Droneaud May 28 '13 at 13:44
  • Sure, everybody read http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ (and get lost :) – Yann Droneaud May 28 '13 at 13:48
  • It makes perfect sense to test floating-point values for equality in many cases. The typical counterexample of `for (float f = 0.0f; f != 1.0f; f += 0.1f);` doesn't contradict that at all. – Aki Suihkonen May 28 '13 at 13:49
  • @Bathsheba: No, I will not provide an inappropriate answer. As I wrote in a comment to the question, there is insufficient information to provide useful advice, other than to say that there is little that can be done to speed up the function. The proper way to proceed at this point is for the questioner to provide more information. – Eric Postpischil May 28 '13 at 13:54
  • @ydroneaud: Actually, excess precision and operation ordering are not contributors to avoiding `==` because `==` is **already** made problematic by **normal** floating-point operations. Rarely does any code want to perform the same computation twice and compare the results for exact equality. What you are often trying to do is compute some result and see if it exceeds some threshold (e.g., is it too heavy for some structure to hold, too high a pressure for some container, is there a signal in the noise). The fact that there are any calculation errors **at all** makes this problematic.… – Eric Postpischil May 28 '13 at 14:02
  • … So, even if you use strict floating-point semantics, no extra precision, there are problems. But we have to step back and ask what the problem is and what do we do about it. And `==` is **not** the problem. The `==` operator **always** returns a mathematically correct result. The actual problem is in earlier operations, which have produced incorrect results. Suppose you have `x` and `y` which have been computed with floating-point and which are not the mathematically correct results. How can you tell if the mathematically correct values, *x* and *y*, are equal? You cannot; it is impossible.… – Eric Postpischil May 28 '13 at 14:04
  • … `x` and `y` do not contain the information about whether *x* and *y* are equal. And comparing with tolerances cannot fix this. But it is not just `==` that is the problem. Suppose you calculate `sqrt(x)` or `acos(x)`, but the error in `x` makes it slightly less than zero or greater than one when it should not be. Then these functions return incorrect results. So we see, it is not `==` that is the problem; **you cannot correctly compute any function from incorrect data** (with certain trivial exceptions). The actual solution is that you must step back, examine the entire calculation,… – Eric Postpischil May 28 '13 at 14:07
  • … figure out the errors that can occur, and design the entire program so that desired results are obtained. So, when we are given an isolated function like this, we **do not have the necessary information** to know what its intent is, what problems there are with using it, or how to fix it. The proper procedure is to refrain from giving advice about it and to ask for more information. – Eric Postpischil May 28 '13 at 14:08
  • @EricPostpischil I agree with you, but your comment (a) seems more and more unclear. You probably mean that avoiding comparison for equality is not the definitive solution to floating-point computation misunderstandings? – Yann Droneaud May 28 '13 at 14:10
  • @ydroneaud: Yes, (a) could use some clarification. Generally, the advice is bad because: (1) It makes people think there is something wrong with the compare-for-equality operation, when in fact that is one of the few perfect floating-point operations. (2) It is usually tied to bad code for using a tolerance that (a) decreases false negatives at the expense of increasing false positives without any knowledge of the trade-offs of these changes in the application and (b) uses a set absolute or relative tolerance without any knowledge of the errors. (3) It fails to inform that **all** functions… – Eric Postpischil May 28 '13 at 14:16
  • The *big* problem with this solution is that it breaks sorting. If `a == b` and `b == c`, any sane sorting algorithm would conclude that `a == c`. However, when using the proposed function, this is not true when `a` and `b`, and `b` and `c`, differ by less than epsilon, respectively, whereas `a` and `c` do not. (A small demonstration follows these comments.) – Lindydancer May 28 '13 at 14:18
  • … generally produce incorrect outputs when given incorrect inputs, except for special cases such as trivial functions or locations where the derivative is zero. That is, it tars and feathers `==` but fails to advise that `acos`, `sqrt`, `log`, `+`, `*`, and other functions will also give incorrect results when you give them inputs containing errors. – Eric Postpischil May 28 '13 at 14:18
  • @ydroneaud: And, now that information has been added to the question, we see that this is in fact one of the situations where `==` is perfectly fine. Sorting numbers should be done with unadorned comparisons (no tolerance). Using a tolerance can break sort routines. – Eric Postpischil May 28 '13 at 14:19
  • I'm actually looking for an efficient implementation for a comparison function suitable for sorting (i.e., with [qsort](http://pubs.opengroup.org/onlinepubs/009695399/functions/qsort.html)). I don't need an equality tolerance for my application. Is @ravi's example the most efficient way to compare two doubles on Intel hardware? – Leo Aug 04 '14 at 18:10
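
To make two of the caveats above concrete, a quick demonstration (my sketch, not from the thread):

#include <cmath>
#include <iostream>
#include <limits>

int main()
{
    const double e = std::numeric_limits<double>::epsilon();  /* ~2.22e-16 */

    /* Comment (d): distinct doubles of magnitude two or more always
     * differ by more than epsilon, so the tolerance test degenerates
     * to exact equality there. */
    double x = 3.0;
    double y = std::nextafter(x, 4.0);  /* the adjacent double above 3.0 */
    std::cout << (y - x > e) << '\n';   /* prints 1 (true) */

    /* Tolerance-based "equality" is not transitive, which breaks the
     * ordering that sorting relies on. */
    double a = 0.0, b = 0.6 * e, c = 1.2 * e;
    std::cout << (std::fabs(a - b) < e)   /* 1: "a equals b" */
              << (std::fabs(b - c) < e)   /* 1: "b equals c" */
              << (std::fabs(a - c) < e)   /* 0: but "a does not equal c" */
              << '\n';
    return 0;
}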

One of the problems here is that the return values are completely incorrect. The first return should be 1, the second should be -1, and the third should be zero. If you're using this for sorting, the sort is probably unstable, which increases its run time.

In principle you can just return `a-b`, unless you are going to be dealing with NaNs.
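
One caveat worth illustrating (my sketch, not part of the answer): the function returns `int`, so `return a-b;` truncates the float difference toward zero; a difference of 0.5f, for example, is returned as 0, reporting equality for unequal values. The well-known branchless formulation avoids computing a difference at all:

int compareAB(float a, float b)
{
    /* Each comparison yields 0 or 1, so the subtraction is exact
     * integer arithmetic giving -1, 0, or 1. NaN inputs make both
     * tests false, so the result is 0. */
    return (a > b) - (a < b);
}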

user207421
  • It may be the case that the program uses a home-grown sort algorithm that expects the values the quoted function returns. However, that may mean there is opportunity for reducing the number of comparisons. – Patricia Shanahan May 28 '13 at 13:31

Floating-point compares can be more expensive than plain integer compares, especially if you have no dedicated floating-point hardware.

Fortunately, you can utilize integer comparison when comparing floating-point numbers, in some cases:

1) This only works when using the IEEE754 floating-point format.

2) It does not work for NaNs.

Accessing the underlying representation is undefined behaviour, as the C language does not specify which floating-point format it uses.

Anyway, there is a catch: the trick only works as long as the floating-point numbers have the same sign. And, in that case, comparing the integer representations of two negative floating-point numbers is the inverse of comparing the floating-point numbers themselves.

I haven't measured the performance of the code below, but chances are that it is faster than your original code. Please let me know the performance gain, if any!

#include <stdint.h>

int compareAB(float a_orig, float b_orig)
{
  /* Get the underlying representation of A and B. This assumes a
   * 32-bit float; note that the pointer cast breaks strict aliasing
   * (see the comments below). */
  int32_t a = *(int32_t *)&a_orig;
  int32_t b = *(int32_t *)&b_orig;

  if (a < 0)
  {
    if (b < 0)
    {
      /* Both are negative. The order of the integer representation is
       * the OPPOSITE of the order of the floating-point value. */
      if (a > b)
      {
        return -1;
      }
      else if (a < b)
      {
        return 1;
      }
      else
      {
        return 0;
      }
    }
    else
    {
      /* A is negative, B isn't => A is smaller. */
      return -1;
    }
  }
  else if (b < 0)
  {
    /* B is negative, A isn't => B is smaller. */
    return 1;
  }
  else
  {
    /* Both are positive. */
    if (a > b)
    {
      return 1;
    }
    else if (a < b)
    {
      return -1;
    }
    else
    {
      return 0;
    }
  }
}

You can test this with:

#include <stdio.h>

float values[] = {-100.0F,
                  -50.0F,
                  0.0F,
                  50.0F,
                  100.0F };

void test(float a, float b)
{
  const char * p = 0;
  printf("%f is ", a);

  switch (compareAB(a, b))
  {
  case -1: p = "smaller than"; break;
  case  0: p = "equal to"; break;
  case  1: p = "greater than"; break;
  }

  printf("%s %f\n", p, b);
}

int main(void)
{
  int i;

  for (i = 0; i < sizeof(values)/sizeof(values[0]); ++i)
  {
    int j;
    for (j = 0; j < sizeof(values)/sizeof(values[0]); ++j)
    {
      test(values[i], values[j]);
    }
  }
}

It gives the same output as when using your original code, namely:

-100.000000 is equal to -100.000000
-100.000000 is smaller than -50.000000
-100.000000 is smaller than 0.000000
-100.000000 is smaller than 50.000000
-100.000000 is smaller than 100.000000
-50.000000 is greater than -100.000000
-50.000000 is equal to -50.000000
-50.000000 is smaller than 0.000000
-50.000000 is smaller than 50.000000
-50.000000 is smaller than 100.000000
0.000000 is greater than -100.000000
0.000000 is greater than -50.000000
0.000000 is equal to 0.000000
0.000000 is smaller than 50.000000
0.000000 is smaller than 100.000000
50.000000 is greater than -100.000000
50.000000 is greater than -50.000000
50.000000 is greater than 0.000000
50.000000 is equal to 50.000000
50.000000 is smaller than 100.000000
100.000000 is greater than -100.000000
100.000000 is greater than -50.000000
100.000000 is greater than 0.000000
100.000000 is greater than 50.000000
100.000000 is equal to 100.000000
Lindydancer
  • The representation of floating-point numbers is what one would call “implementation-defined”. Undefined behavior would allow the compiler to make demons fly out of your nose if you accessed it. The compiler is not, in fact, allowed to do that. On the other hand, `*(int32_t *)&a_orig` **is** undefined behavior and allows the compiler to make demons fly out of your nose, because it breaks “strict aliasing”: http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule . `memcpy()` should be used, or a union as allowed by C99TC3 (see footnote 82) or C11. (A sketch of the `memcpy()` approach follows these comments.) – Pascal Cuoq May 28 '13 at 14:26
  • Also, the OP refers to `ucomiss` in a comment. Your function is definitely not faster than the OP's original code compiled to a couple of `ucomiss` instructions. – Pascal Cuoq May 28 '13 at 14:29
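
Following the comment's suggestion, here is a minimal sketch of a strict-aliasing-safe way to read the bit pattern (assuming a 32-bit IEEE754 float; the helper name `float_bits` is mine):

#include <stdint.h>
#include <string.h>

/* Copy the float's bytes into a same-sized integer. memcpy is the
 * well-defined way to inspect an object's representation, and
 * compilers typically optimize the copy away. */
static int32_t float_bits(float f)
{
    int32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}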