0

I know how to check whether two floating-point numbers are almost equal, using some simple code:

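#include <math.h>     /* fabs */
#include <stdbool.h>  /* bool */

/* Returns true when a and b differ by less than one ten-millionth. */
bool compare(double a, double b)
{
    return fabs(a - b) < (1.0 / 10000000);
}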

But suppose I have some random data, say 9.0 and 9.5, or 9.4, and I want to treat them as equal numbers. How can I do this? They're NOT equal, but I can allow a small error of +/- 0.5. Any ideas?

With this error, I would treat the following pairs as equal:

9.1 and 9.0
3.1 and 3.6
-4.2 and -4.6

nullpointer
  • 6
    What about changing the fraction in the `if` statement? – elyashiv Dec 16 '13 at 19:09
  • [Here](http://stackoverflow.com/questions/13940316/floating-point-comparison-revisited) and [here](http://stackoverflow.com/questions/16513881/two-general-cs-questions) are related questions. – Eric Postpischil Dec 16 '13 at 19:23
  • 1
    Your sample code shows a tolerance of one ten-millionth but your sample data shows a difference of .5. That is a huge difference. The difference between 3.6 and 3.1 is 16%, which is generally considered a large error, not to be ignored in most situations. Why would you want to consider numbers so far apart to be the same? Answering that question is important to tailoring guidance for your situation. – Eric Postpischil Dec 16 '13 at 19:29
  • 2
    Per a comment from the OP, this appears to actually be a statistical question, not a floating-point question. I suspect the problem is to determine whether two sets of data represent samples from two populations with the same mean, rather than to determine whether the results of two floating-point calculations could be proxies for values that would be equal if calculated with exact mathematics. If so, it may be a math question rather than a programming question, suitable more for [Mathematics Stack Exchange](http://math.stackexchange.com) than for Stack Overflow. – Eric Postpischil Dec 16 '13 at 19:33

2 Answers

6

It is impossible to know what tolerance to use to accept unequal numbers as equal without knowing what calculation errors can exist in those numbers and what is acceptable for the purpose of the application.

It is possible that a few simple arithmetic operations will produce infinite error, and it is also possible that millions of arithmetic operations will produce a result with no error. Calculating what error may have occurred has to be done individually for a computation; there is no general rule. There is not even a general rule for the type of error that is acceptable: Some calculations result in errors that are proportional to the results (relative errors), some result in errors that are absolute, and some result in errors that are complicated functions of data that might not even be present in the values being examined. So even a routine that compares with relative error given a parameter for the amount of error is insufficient for general use.

Additionally, accepting unequal numbers as equal reduces false negatives (situations where numbers that would have been equal if calculated with exact mathematics are unequal because approximate arithmetic was used) at the expense of increasing false positives (accepting numbers as equal even though they are actually unequal). Some applications can tolerate this. Some cannot.

If you want more guidance, you need to explain further what you are doing and what your goals are.

Eric Postpischil
  • I have some mean values calculated by the simple average method on randomly generated data. I would then like to check whether my data (the calculated means) are stationary, so I need to check whether the means are constant over time. With random data it's difficult to assess stationarity by comparing means, so I need a small error value to see whether my means are `constant`. – nullpointer Dec 16 '13 at 19:27
  • 2
    @nullpointer: I think what you are getting at is that you have samples of some physical/real-world process, and you want to test whether the mean of some aspect of this process is unchanging. If so, that is a statistical test, not a floating-point test. – Eric Postpischil Dec 16 '13 at 19:31
  • But I want to make a Java program for it, so I need to compare those numbers anyway, right? – nullpointer Dec 16 '13 at 19:35
  • 3
    @nullpointer: You tagged this question with C, not Java. However, the code for Java would be largely the same. Ultimately, your statistical test might boil down to a comparison of the difference of the two numbers (`fabs(a-b)`) to some threshold. However, determining that threshold is a complicated issue. You need to have some mathematical model of the processes, some measure of the variation in the values (perhaps a standard deviation estimated from the samples), and some probability threshold you consider acceptable, and then you would have to calculate a threshold. – Eric Postpischil Dec 16 '13 at 19:40
  • Right, C not Java, sorry. Huh, so it's not that simple. I can calculate the standard deviation, but how do I calculate this acceptable error? – nullpointer Dec 16 '13 at 19:51
  • @nullpointer: That is a statistical question. You might get an answer at [Mathematics Stack Exchange](http://math.stackexchange.com/). You will need some description of what the data is. (Stock market prices? Service response times?) The question you are seeking to solve might be how to test the hypothesis that two distributions have the same means given samples drawn from each. But you have not described the problem well enough for me to be sure. – Eric Postpischil Dec 16 '13 at 20:11
5

If I understand you correctly, this code will do:

#include <math.h>     /* fabs */
#include <stdbool.h>  /* bool */

/* Returns true when a and b differ by less than precision. */
bool compare(double a, double b, double precision)
{
    return fabs(a - b) < precision;
}
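Applied to the question's numbers, the tolerance would be 0.5. One detail worth deciding is whether a difference that lands exactly on the tolerance should count. The sketch below uses `<=` so that it does, which is an assumption about what the asker wants rather than something stated in the question. The name `within_tolerance` is hypothetical.

```c
#include <math.h>     /* fabs */
#include <stdbool.h>  /* bool */

/* Hypothetical helper: true when a and b differ by at most tolerance. */
bool within_tolerance(double a, double b, double tolerance)
{
    return fabs(a - b) <= tolerance;
}
```

With this helper, `within_tolerance(9.0, 9.4, 0.5)` and `within_tolerance(-4.2, -4.6, 0.5)` are true, and `within_tolerance(9.0, 10.0, 0.5)` is false. A pair such as 3.1 and 3.6 sits on the boundary, so its verdict depends on the choice between `<` and `<=` and on how the inputs were rounded.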
elyashiv