0

I am developing a big code base and wanted to use set-s from the STL to store double elements. I just ran some tests and found something very weird.

Here's my source code

#include <iostream>
#include <set>
#include <numeric>
#include <algorithm>

int main()
{
    double iter = 0.001;
    int count = 0;
    std::set<double> S;

    while(count < 10) 
    {
        S.insert(iter);
        std::cout << "Inserted element is " << iter << std::endl;
        iter += 0.001;
        count++;
    }

    for (std::set<double>::iterator i = S.begin(); i != S.end(); i++) 
    {
       double element = *i;
       std::cout << "The element in the set is " << element << std::endl;
    }

    std::cout << "The count of the element to be found is " << S.count(0.009) << std::endl;

    if (S.find(0.008) != S.end())
        std::cout << "set::find found it!" << std::endl;
    else 
        std::cout << "set::find didn't find it!" << std::endl;  

    if (std::find(S.begin(), S.end(), 0.009) != S.end())
        std::cout << "std::find found it!" << std::endl;
    else
        std::cout << "std::find didn't find it!" << std::endl;

    return 0;
}

Now, the weird part is that both set::find and std::find are able to find all elements until 0.008 but cannot find anything after that.

The output I get for the above code is shown below:

Inserted element is 0.001
Inserted element is 0.002
Inserted element is 0.003
Inserted element is 0.004
Inserted element is 0.005
Inserted element is 0.006
Inserted element is 0.007
Inserted element is 0.008
Inserted element is 0.009
Inserted element is 0.01
The element in the set is 0.001
The element in the set is 0.002
The element in the set is 0.003
The element in the set is 0.004
The element in the set is 0.005
The element in the set is 0.006
The element in the set is 0.007
The element in the set is 0.008
The element in the set is 0.009
The element in the set is 0.01
The count of the element to be found is 0
set::find found it!
std::find didn't find it!

Notice that I have asked set::find to find 0.008 while I have asked both count and std::find to count and find 0.009 respectively. I just wanted to know why I'm seeing such an anomaly.

It is also worth mentioning that both the find functions have absolutely no problems finding integer elements. I was wondering if this is a double issue. Even if that's the case, I'm not sure why it is able to find a few elements but not all.

I'm not sure what causes this error but if it's of any use I work on Mac OS X 10.10.3 with gcc 4.9.2.

  • 1
    [Read this](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html). (tl;dr: floating point representation is fuzzy) – M.M Jun 16 '15 at 06:02
  • For all intents and purposes it's futile to use `std::set` – M.M Jun 16 '15 at 06:04
  • @MattMcNabb: not so... just most... but anyone who needs to ask about problems on here won't know which intents and purposes are reasonable, so they'd best work to your "all" guideline. – Tony Delroy Jun 16 '15 at 06:07

1 Answers1

4

Yes it's a double "issue" - when you successively add 0.001 to iter, rounding errors can accumulate such that the resultant value doesn't match what you might expect (e.g. your hardcoded 0.008).

Because set is sorted, you may want to use e.g. lower_bound to find a nearby element, then check using some tolerance for differences. For example, checking that the absolute difference between the found and expected values is less than an epsilon amount. There are hundreds of questions on S.O. about how exactly to implement the comparison... e.g. here.

Another issue specific to the use of floating point numbers in set is that you may not get inserts deemed duplicates when you expect them too, or on the other hand insert may refuse to insert two values that you didn't expect to be duplicates, as the rounding errors during their generation has left them identical. It's also possible for the order of two extremely similar values to be reversed compared to what you'd mathematically expect given perfectly accurate calculations.

Community
  • 1
  • 1
Tony Delroy
  • 102,968
  • 15
  • 177
  • 252
  • Since my final code needs tolerance, I would anyway be implementing it. But is there a reason it works for the first few numbers? – sriramkswamy Jun 16 '15 at 06:14
  • @sriramkswamy: luck? sometimes the value can be as expected, other times there can be a small error, the next calculation with that slightly erroneous result will occasionally produce another "as expected" result - it's all hard to reason about. Just have to accept it's not reliable and decide on some strategies for handling it. There are many (e.g. storing an integral number of millionths can sometimes be best, there are arbitrary precision libraries, "decimal" classes that are exact for numbers that can be represented in decimal, albeit sometimes to a set number of significant digits). – Tony Delroy Jun 16 '15 at 06:28