23

I understand why the floats don't have an implementation for Ord but that doesn't particularly help me when I want to be lazy and use iterators.

Is there a workaround or an easy way to take the minimum / min / min_by of an iterator containing floating point numbers?

I know one can sort (which is slow) or wrap it in another type and implement the needed traits (which is verbose), but I am hoping for something a little more elegant.

Shepmaster
luke

4 Answers

43

Floats have their own min and max methods that handle NaN consistently (if exactly one argument is NaN, the other argument is returned), so you can fold over the iterator:

use std::f64;

fn main() {
    let x = [2.0, 1.0, -10.0, 5.0, f64::NAN];

    let min = x.iter().fold(f64::INFINITY, |a, &b| a.min(b));
    println!("{}", min);
}

Prints -10.
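
One caveat with the fold above is that an empty iterator yields the seed value, f64::INFINITY. A minimal sketch, assuming Rust 1.51 or later for Iterator::reduce, that returns None in that case instead; min_ignoring_nan is a hypothetical helper name, not something from the standard library:

```rust
/// Smallest value in `xs`, skipping NaNs the same way `f64::min` does.
/// Returns `None` for an empty slice instead of the `f64::INFINITY` seed.
/// (`min_ignoring_nan` is a name made up for this sketch.)
fn min_ignoring_nan(xs: &[f64]) -> Option<f64> {
    // `reduce` uses the first element as the initial accumulator,
    // so no sentinel value is needed.
    xs.iter().copied().reduce(f64::min)
}
```

Note that if every element is NaN, the result is still Some(NaN), since there is no other value for f64::min to prefer.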

If you want different NaN handling, you can use PartialOrd::partial_cmp. For example, if you wish to propagate NaNs, fold with:

use std::f64;
use std::cmp::Ordering;

fn main() {
    let x = [2.0, 1.0, -10.0, 5.0, f64::NAN];

    let min = x.iter().fold(f64::INFINITY, |a, &b| {
        match PartialOrd::partial_cmp(&a, &b) {
            None => f64::NAN,
            Some(Ordering::Less) => a,
            Some(_) => b,
        }
    });
    println!("{}", min);
}
Shepmaster
huon
12

If you know your data does not contain NaNs, then assert that fact by unwrapping the comparison:

fn example(x: &[f64]) -> Option<f64> {
    x.iter()
        .cloned()
        .min_by(|a, b| a.partial_cmp(b).expect("Tried to compare a NaN"))
}

If your data may have NaNs, you need to handle that case specifically. One solution is to say that all NaN values (an f64 has 9,007,199,254,740,990 distinct NaN bit patterns; an f32 has 16,777,214) are equal to each other and are either always greater than or always less than other numbers:

use std::cmp::Ordering;

fn example(x: &[f64]) -> Option<f64> {
    x.iter()
        .cloned()
        .min_by(|a, b| {
            // all NaNs are greater than regular numbers
            match (a.is_nan(), b.is_nan()) {
                (true, true) => Ordering::Equal,
                (true, false) => Ordering::Greater,
                (false, true) => Ordering::Less,
                _ => a.partial_cmp(b).unwrap(),
            }
        })
}

There are numerous crates available that can be used to give you whichever semantics your code needs.
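
As a rough idea of what such a wrapper type involves, here is a hand-rolled sketch. It is not taken from any particular crate: TotalF64 and min_via_wrapper are hypothetical names, and delegating to f64::total_cmp (which needs Rust 1.62 or later) is just one possible choice of semantics.

```rust
use std::cmp::Ordering;

/// Hypothetical newtype (not from any specific crate) that gives `f64`
/// a total order by delegating to `f64::total_cmp`.
#[derive(Debug, Clone, Copy)]
struct TotalF64(f64);

impl PartialEq for TotalF64 {
    fn eq(&self, other: &Self) -> bool {
        // Equality must agree with `cmp`, so 0.0 and -0.0 are distinct here.
        self.0.total_cmp(&other.0) == Ordering::Equal
    }
}

impl Eq for TotalF64 {}

impl PartialOrd for TotalF64 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for TotalF64 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

/// With `Ord` in place, the plain `Iterator::min` works.
fn min_via_wrapper(xs: &[f64]) -> Option<f64> {
    xs.iter().copied().map(TotalF64).min().map(|w| w.0)
}
```

The four trait impls are the verbosity the question complains about; the payoff is that min, max, sort and BTreeMap keys all work on the wrapped values without a custom comparator.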


You should not use partial_cmp(b).unwrap_or(Ordering::Equal): when NaNs are present, the result depends on where they sit in the input, yet the code misleads the reader into thinking that they are handled:

use std::cmp::Ordering;
use std::f64;

fn example(x: &[f64]) -> Option<f64> {
    x.iter()
        .cloned()
        .min_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal))
}

fn main() {
    println!("{:?}", example(&[f64::NAN, 1.0]));
    println!("{:?}", example(&[1.0, f64::NAN]));
}
Some(NaN)
Some(1.0)
Shepmaster
  • The comparison function can be abbreviated to `a.partial_cmp(b).unwrap_or_else(|| a.is_nan().cmp(&b.is_nan()))`, which is shorter, but probably not easier to read. – Sven Marnach Nov 10 '21 at 14:18
8

A built-in total-ordering comparison method for floats, .total_cmp(), has been stable since Rust 1.62.0. It implements the total ordering defined in IEEE 754, with every possible f64 bit pattern sorted distinctly, including positive and negative zero and all of the possible NaNs. Be aware that some NaNs sort above Infinity and some sort below -Infinity, so the "maximum" value may be confusing in the presence of NaN, but it will be consistent.

Floats still won't implement Ord, so they won't be directly sortable, but the boilerplate has been cut down to a single line, without any external imports or chance of panicking:

fn main() {
    let mut a: Vec<f64> = vec![2.0, 2.5, -0.5, 1.0, 1.5];
    
    let maximum = *a.iter().max_by(|a, b| a.total_cmp(b)).unwrap();
    println!("The maximum value was {maximum}.");

    a.sort_by(f64::total_cmp);
}
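
Because NaNs can land at either end of the order under total_cmp, one option is to filter them out before taking the minimum. A minimal sketch under that assumption; min_skipping_nan is a hypothetical helper name:

```rust
/// Minimum of the non-NaN values in `xs`, ordered by `f64::total_cmp`
/// (stable since Rust 1.62). `min_skipping_nan` is a made-up name.
fn min_skipping_nan(xs: &[f64]) -> Option<f64> {
    xs.iter()
        .copied()
        // Drop NaNs first so they cannot end up as the minimum.
        .filter(|v| !v.is_nan())
        // `total_cmp` orders -0.0 before 0.0, unlike `partial_cmp`.
        .min_by(|a, b| a.total_cmp(b))
}
```

This returns None both for an empty slice and for a slice that contains only NaNs.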
eisterman
  • `max_by_key` expects a function that takes 1 argument, but `f64::total_cmp` takes two. So this code does not compile as provided I think. – SirVer Oct 26 '22 at 18:48
  • I can confirm this code cannot compile; the right solution is `let maximum = *a.iter().max_by(|a, b| a.total_cmp(b)).unwrap();` – eisterman Dec 14 '22 at 16:51
0

Perhaps like this?

fn main() {
    use std::cmp::Ordering;
    let mut x = [2.0, 1.0, -10.0, 5.0];
    x.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));
    println!("min in x: {:?}", x[0]); // after an ascending sort, the minimum is first
}

One thing I struggled with is that sort_by mutates the slice in place and returns (), so you can't use it in a chain directly.

Shepmaster
dvdplm