Is there an equivalent to C++ comparator objects for BTreeMap and other things that rely on Ord?

Question

I can't seem to accomplish the same things with the Ord trait that I was able to with C++ comparator objects. Take this simplification of more complicated scenarios for example:

int epsilon = 0.001;
auto less = [epsilon](const double a, const double b) {
                if (fabs(a - b) > epsilon) {
                    return a < b;
                }
                else {
                    return false;
                }
            };
std::map<float, int, decltype(less)> myMap(less);
myMap.insert(std::make_pair(1.0, 1));
myMap.insert(std::make_pair(2.0, 2));
myMap.insert(std::make_pair(3.0, 3));

auto found = myMap.find(1.0001);
assert(found != myMap.end());
assert(found->second == 1);

I want a comparator to have some runtime context, like an epsilon value. I can't figure out how you would do this with BTreeMap since I can't add state to the Ord trait. Is there some technique I haven't figured out yet to do the equivalent?

EDIT

I had a typo in my example that prevented me from realizing the C++ didn't actually work. I'm reevaluating the problem.

`int epsilon = 0.001;` => if you provide snipped please read [mcve]. — Stargateur, Sep 25 '18 at 06:46
(side remark) Note that your `less` does not define a valid order for use in `std::map`, according to the C++ standard. — Marc Glisse, Sep 25 '18 at 06:58
Possible duplicate of [How can I use a HashMap with f64 as key in Rust?](https://stackoverflow.com/questions/39638363/how-can-i-use-a-hashmap-with-f64-as-key-in-rust) — hellow, Sep 25 '18 at 07:11
Your question is not about finding an element in a hashmap/btree (that's easy, just use [`find`](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.find) on the iterator), but instead how to use a float/double as a key in a hashmap. — hellow, Sep 25 '18 at 07:11
@hellow except that `find` on the iterator uses linear time, whereas a binary tree allows for logarithmic time. — Jmb, Sep 25 '18 at 07:21
That's right Jmb, I think that should be improved. I opened an [issue on Github](https://github.com/rust-lang/rust/issues/54550), thanks! — hellow, Sep 25 '18 at 07:34
Unfortunately, your C++ code is flawed: `less` is not transitive. For example, [if you follow this link](http://cpp.sh/2mq5e) you'll see I sort a vector `{ 2.0, 1.0, 0.0 }` with an epsilon of 1.0; the resulting "sorted" vector is `{ 0.0, 2.0, 1.0 }` even though according to `less` both `0.0` and `1.0` are equal and therefore should be adjacent. In general, epsilon-based comparisons are not transitive, and therefore cannot be used to define an order that `sort` or `map` can use. — Matthieu M., Sep 25 '18 at 12:00
This may be far too late to be of any use to you @jsadusk, but for future readers of this question it may be helpful to know that my (recently published) [copse](https://crates.io/crates/copse) crate provides an alternative BTreeSet that can be instantiated with a custom/runtime-defined comparator. — eggyal, Jan 16 '23 at 19:48

Peter Hall · Answer 1 · 2018-09-25T11:57:55.160

There is a reason why floating point types do not implement Ord. The internals of BTreeMap make assumptions about implementors of Ord, which allow it to be more efficient, but if those assumptions turn out to be incorrect, then it can lead to Undefined Behaviour, because these assumptions are relied upon in unsafe code. Floating points cannot satisfy these assumptions because of the existence of NaN, values that represent infinity and the nature of floating point arithmetic which means the "same" number can have different binary representations.

Your C++ code is likely to suffer from the same issues, but sometimes code with Undefined Behaviour just seems to work. Until one day when it doesn't - that's just the nature of Undefined Behaviour!

Floating points are good for representing measures or where the values can have massively varying orders of magnitude. If you do a calculation about the distances between cities and the numbers are off by a few nanometres, you don't care. You'll never have to find another city that is exactly the same distance as London is from New York. More likely you'll be happy to search for a city that is the same distance to the nearest 1 km - a number that you can compare as an integer.

Which brings me to the question of why are you using floating point numbers as keys? What does it mean to you and what are you trying to store there? Is f64::NAN a value that you want to be valid? Is 45.00000000001 the "same" as 45.00000000001001? Are you just as likely to be storing very large numbers as very tiny numbers? Is exact equality something that makes sense for these numbers? Are they the result of a computation, which could have rounding errors?

It's not possible to tell you what to do here, but I can suggest that you look at your real problem and model it in a way that reflects your real constraints. If you only care about a specific precision, then round the numbers to that precision and store them in a fixed-point type, or an integer, or perhaps a rational, all of which implement Ord.

Based on your C++ code, it looks like you are happy with a precision of 0.001. So you can store your keys as integers - you'll just have to remember to do the conversion and multiply/divide by 1000 as appropriate. You'll have to deal with the possibility of NaN and infinities, but you'll do it in safe code, so you won't have to worry about UB.

Here is how you could use the num-rational crate to get something with similar behaviour to your C++ code:

extern crate num_rational; // 0.2.1

use num_rational::Rational64;
use std::collections::BTreeMap;

fn in_thousandths(n: f64) -> Rational64 {
    // you may want to include further validation here to deal with `NaN` or infinities
    let denom = 1000;
    Rational64::new_raw((n * denom as f64) as i64, denom)
}

fn main() {
    let mut map = BTreeMap::new();

    map.insert(in_thousandths(1.0), 1);
    map.insert(in_thousandths(2.0), 2);
    map.insert(in_thousandths(3.0), 3);

    let value = map.get(&1.into());
    assert_eq!(Some(&1), value);
}

`in_thousandths(std::f64::NAN)` -> `Ratio { numer: -9223372036854775808, denom: 1000 }`. Still not a good one :( Float sucks! — hellow, Sep 25 '18 at 11:55

Stargateur · Answer 2 · 2018-09-25T07:50:47.823

-2

I think you should use your own type and implement yourself Ord trait:

#[derive(PartialOrd, Debug)]
struct MyNumber {
    value: f64,
}

impl MyNumber {
    const EPSILON: f64 = 0.001;
    fn new(value: f64) -> Self {
        Self { value }
    }
}

impl PartialEq for MyNumber {
    fn eq(&self, other: &MyNumber) -> bool {
        if f64::abs(self.value - other.value) < MyNumber::EPSILON {
            return true;
        } else {
            return false;
        }
    }
}

impl Eq for MyNumber {}

use std::cmp::Ordering;

impl Ord for MyNumber {
    fn cmp(&self, other: &MyNumber) -> Ordering {
        if self == other {
            Ordering::Equal
        } else if self < other {
            Ordering::Less
        } else {
            Ordering::Greater
        }
    }
}

fn main() {
    use std::collections::BTreeMap;

    let mut map = BTreeMap::new();
    map.insert(MyNumber::new(1.0), 1);
    map.insert(MyNumber::new(2.0), 2);
    map.insert(MyNumber::new(3.0), 3);

    println!("{:?}", map.get(&MyNumber::new(1.0001)));
}

But I think it doesn't match the requirement for a BTreeMap.

edited Sep 25 '18 at 07:50

answered Sep 25 '18 at 07:39

Stargateur

24,473
8
65
91

There is a reason that f64 does not implement `Ord` and you should not blindly implement it. This will lead to various errors! (Think what will happen in case of `Inf` or `Nan`) – hellow Sep 25 '18 at 07:48
@hellow Yes, but that not me who want to do that. – Stargateur Sep 25 '18 at 07:51
But you are pointing into that direction. You should not suggest a "wrong" solution. For a specific reason why one should not do this, please read https://stackoverflow.com/questions/39638363/how-can-i-use-a-hashmap-with-f64-as-key-in-rust – hellow Sep 25 '18 at 07:58
2

Your newtype could enforce that infinities and NaN are not allowed, a bit like NonZeroUsize and friends. That would remove the UB. – Peter Hall Sep 25 '18 at 08:01
@PeterHall This is not enough NaN and other is not the only problem, Ord require "transitive, a < b and b < c implies a < c. The same must hold for both == and >." with this code we can have a == b equal to true, b == c also equal to true but a == c could be false. But [here](https://play.rust-lang.org/?gist=8b803eb567b4c0689ff2f773d6161013&version=stable&mode=debug&edition=2015) my attempt to make the code unreadable ;) so whatever this is impossible to make for a BTreeMap. The type is simple not designed for this kind a data. – Stargateur Sep 25 '18 at 08:22
@Stargateur Oh yeah, I forgot about the symmetry with == – Peter Hall Sep 25 '18 at 08:30
@Stargateur The documentation on both [`PartialEq`](https://doc.rust-lang.org/std/cmp/trait.PartialEq.html) and [`PartialOrd`](https://doc.rust-lang.org/std/cmp/trait.PartialOrd.html) require that implementing types satisfy transitivity. Are you implying that floating point types are erroneously implementing these traits? – E_net4 Sep 25 '18 at 08:56
@E_net4 No, but standard just compare with strict equality but it's quite pointless for floating number. I just said that if you use an epsilon value you can't have transitivity. – Stargateur Sep 25 '18 at 09:12

Jmb · Answer 3 · 2018-09-25T08:39:32.047

The way to do that in rust is to wrap your key in a newtype and implement Ord the way you want it:

use std::cmp::Ordering;
use std::collections::BTreeMap;

const EPSILON: f64 = 0.001;

struct MyFloat (f64);

impl Ord for MyFloat {
    fn cmp (&self, other: &MyFloat) -> Ordering {
        if (self.0 - other.0).abs() < EPSILON {
            // I assume that this is what you wanted even though the C++ example
            // does the opposite
            Ordering::Equal
        } else if self.0 < other.0 {
            Ordering::Less
        } else {
            Ordering::Greater
        }
    }
}

impl PartialOrd for MyFloat {
    fn partial_cmp (&self, other: &MyFloat) -> Option<Ordering> {
        Some (self.cmp (other))
    }
}

impl PartialEq for MyFloat {
    fn eq (&self, other: &MyFloat) -> bool {
        (self.0 - other.0).abs() < EPSILON
    }
}

impl Eq for MyFloat {}

fn main() {
    let mut map = BTreeMap::new();
    map.insert (MyFloat (1.0), 1);
    map.insert (MyFloat (2.0), 2);
    map.insert (MyFloat (3.0), 3);

    let found = map.get (&MyFloat (1.00001));
    println!("Found: {:?}", found);
}

playground

Note that this will only work if the keys are finite and spaced by more than EPSILON.

There is a reason that f64 does not implement `Ord` and you should not blindly implement it. This will lead to various errors! (Think what will happen in case of `Inf` or `Nan`) — hellow, Sep 25 '18 at 07:48
That issue is also present in the OP's C++ code. I've edited the answer to make the limitations more apparent. — Jmb, Sep 25 '18 at 08:37
@Jmb: If there are issues in the OP code, then the appropriate answer is to correct the OP's assumptions (a frame challenge) rather than blindly reproduce the errors in the answer. — Matthieu M., Sep 25 '18 at 15:07

Is there an equivalent to C++ comparator objects for BTreeMap and other things that rely on Ord?

3 Answers3