How to do a binary search on a Vec of floats?

Question

If you have a Vec<u32> you would use the slice::binary_search method.

For reasons I don't understand, f32 and f64 do not implement Ord. Since the primitive types are from the standard library, you cannot implement Ord on them yourself, so it does not appear you can use this method.

How can you effectively do this?

Do I really have to wrap f64 in a wrapper struct and implement Ord on it? It seems extremely painful to have to do this, and involves a great deal of transmute to cast blocks of data back and forth unsafely for effectively no reason.

`f32` and `f64` do not have a total ordering because NaN values are not equal to, greater than, nor less than any other floating point value according to [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) — RBF06, May 25 '22 at 02:16

score 35 · Accepted Answer · edited May 23 '17 at 10:31

for reasons I don't understand, f32 and f64 do not implement Ord.

Because floating point is hard! The short version is that floating point numbers have a special value NaN - Not a Number. The IEEE spec for floating point numbers states that 1 < NaN, 1 > NaN, and NaN == NaN are all false.

Ord says:

Trait for types that form a total order.

This means that the comparisons need to have totality:

a ≤ b or b ≤ a

but we just saw that floating points do not have this property.

So yes, you will need to create a wrapper type that somehow deals with comparing the large number of NaN values. Maybe your case you can just assert that the float value is never NaN and then call out to the regular PartialOrd trait. Here's an example:

use std::cmp::Ordering;

#[derive(PartialEq,PartialOrd)]
struct NonNan(f64);

impl NonNan {
    fn new(val: f64) -> Option<NonNan> {
        if val.is_nan() {
            None
        } else {
            Some(NonNan(val))
        }
    }
}

impl Eq for NonNan {}

impl Ord for NonNan {
    fn cmp(&self, other: &NonNan) -> Ordering {
        self.partial_cmp(other).unwrap()
    }
}

fn main() {
    let mut v: Vec<_> = [2.0, 1.0, 3.0].iter().map(|v| NonNan::new(*v).unwrap()).collect();
    v.sort();
    let r = v.binary_search(&NonNan::new(2.0).unwrap());
    println!("{:?}", r);
}

One of the slice extension methods is `binary_search_by`, which you could use. `f32`/`f64` implement `PartialOrd`, so if you know they can never be `NaN`, you can unwrap the result of `partial_cmp`: http://doc.rust-lang.org/std/slice/trait.SliceExt.html#tymethod.binary_search_by — BurntSushi5, Jan 31 '15 at 04:50
One can use the [ordered_float](https://crates.io/crates/ordered-float) or [ord_subset](https://crates.io/crates/ord_subset) crate. See http://stackoverflow.com/a/37144472/5189607 — malbarbo, May 10 '16 at 16:48

score 16 · Answer 2 · answered May 07 '16 at 03:23

One of the slice methods is binary_search_by, which you could use. f32/f64 implement PartialOrd, so if you know they can never be NaN, you can unwrap the result of partial_cmp:

fn main() {
    let values = [1.0, 2.0, 3.0, 4.0, 5.0];
    let location = values.binary_search_by(|v| {
        v.partial_cmp(&3.14).expect("Couldn't compare values")
    });

    match location {
        Ok(i) => println!("Found at {}", i),
        Err(i) => println!("Not found, could be inserted at {}", i),
    }
}

score 9 · Answer 3 · 2022-07-19T23:48:59.670

A built-in total-ordering comparison method for floats named .total_cmp() is now stable, as of Rust 1.62.0. This implements that total ordering defined in IEEE 754, with every possible f64 bit value being sorted distinctly, including positive and negative zero, and all of the possible NaNs.

Floats still won't implement Ord, so they won't be directly sortable, but the boilerplate has been cut down to a single line, without any external imports or chance of panicking:

fn main() {
    let mut v: Vec<f64> = vec![2.0, 2.5, -0.5, 1.0, 1.5];
    v.sort_by(f64::total_cmp);

    let target = 1.25;
    let result = v.binary_search_by(|probe| probe.total_cmp(&target));

    match result {
        Ok(index) => {
            println!("Found target {target} at index {index}.");
        }
        Err(index) => {
            println!("Did not find target {target} (expected index was {index}).");
        }
    }
}

Venkat Arun · Answer 4 · 2019-04-02T16:26:57.507

3

If you are sure that your floating point values will never be NaN, you can express this semantic with the wrappers in decorum. Specifically, the type Ordered implements Ord and panics whenever the program tries to do something invalid:

use decorum::Ordered;

fn foo() {
    let ordered = Ordered<f32>::from_inner(10.);
    let normal = ordered.into()
}

edited Apr 02 '19 at 16:26

answered Mar 29 '19 at 14:32

Venkat Arun

31
3

score 1 · Answer 5 · answered Feb 06 '18 at 00:52

https://github.com/emerentius/ord_subset implements a ord_subset_binary_search() method that you can use for this.

from their README:

let mut s = [5.0, std::f64::NAN, 3.0, 2.0];
s.ord_subset_sort();
assert_eq!(&s[0..3], &[2.0, 3.0, 5.0]);
assert_eq!(s.ord_subset_binary_search(&5.0), Ok(2));

assert_eq!(s.iter().ord_subset_max(), Some(&5.0));
assert_eq!(s.iter().ord_subset_min(), Some(&2.0));

score 0 · Answer 6 · answered Jun 26 '23 at 06:09

You can derive Ord on your newtype using nutype with finite validation without much boilerplate. The finite validation excludes NaN and Infinity, what enables for nutype to derive Ord safely:

Here is an example:

use nutype::nutype;

#[nutype(validate(finite))]
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Float(f64);

fn main() {
    let raw_data = vec![1.1, 2.5, 0.1, 4.5];
    let mut data: Vec<Float> = raw_data
        .into_iter()
        .map(|val| Float::new(val).unwrap())
        .collect();

    data.sort();
    println!("data = {data:?}");

    let idx25 = data.binary_search(&Float::new(2.5).unwrap());
    println!("2.5 index = {idx25:?}");

    let idx36 = data.binary_search(&Float::new(3.6).unwrap());
    println!("3.6 index = {idx36:?}");
}

Output:

data = [Float(0.1), Float(1.1), Float(2.5), Float(4.5)]
2.5 index = Ok(2)
3.6 index = Err(3)

Note that Float::new(val) returns Result. Passing something like Float::new(0.0/0.0) would return an error.

How to do a binary search on a Vec of floats?

6 Answers6

Linked

Related