21

Assumption -- The Vec<f32> does not have any NaN values or exhibit any NaN behavior.

Take the following sample set:

0.28  
0.3102
0.9856
0.3679
0.3697
0.46  
0.4311
0.9781
0.9891
0.5052
0.9173
0.932 
0.8365
0.5822
0.9981
0.9977

What is the neatest and most stable way to get the index of the highest value in the above list (values can be negative)?

My initial attempts were along the following lines:

let _tmp = *nets.iter().max_by(|i, j| i.partial_cmp(j).unwrap()).unwrap();    
let _i = nets.iter().position(|&element| element == _tmp).unwrap();

Where nets is a &Vec<f32>. Which to me seems blatantly incorrect.

The Python equivalent of this that works (taking into consideration the above assumption):

_i = nets.index(max(nets))
Braiam
  • 1
  • 11
  • 47
  • 78
Juxhin
  • 5,068
  • 8
  • 29
  • 55
  • _"Where nets is a `&Vec`. Which to me seems blatantly incorrect."_ — Does that mean you think there is something wrong with `Vec` or that you have made a mistake? – Peter Hall Dec 23 '18 at 12:29
  • @PeterHall -- That I made a mistake. :-) – Juxhin Dec 23 '18 at 12:33
  • See also [Using max_by_key on a vector of floats](https://stackoverflow.com/q/37127209/155423); [How do I get the minimum or maximum value of an iterator containing floating point numbers?](https://stackoverflow.com/q/28446632/155423); [How to do a binary search on a Vec of floats?](https://stackoverflow.com/q/28247990/155423); [Why does Rust not implement total ordering via the Ord trait for f64 and f32?](https://stackoverflow.com/q/26489701/155423); etc. – Shepmaster Dec 24 '18 at 15:09
  • *`nets` is a `&Vec`* — [Why is it discouraged to accept a reference to a String (&String), Vec (&Vec) or Box (&Box) as a function argument?](https://stackoverflow.com/q/40006219/155423). – Shepmaster Dec 24 '18 at 15:10
  • Thanks @Shepmaster! – Juxhin Dec 24 '18 at 15:34

5 Answers5

26

Is there a reason why this wouldn't work?

>= Rust 1.62.0 (2022-06-30)

use std::cmp::Ordering;
   
fn example(nets: &Vec<f32>) {
    let index_of_max: Option<usize> = nets
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.total_cmp(b))
        .map(|(index, _)| index);
}

< Rust 1.62.0 (2022-06-30)

use std::cmp::Ordering;
   
fn example(nets: &Vec<f32>) {
    let index_of_max: Option<usize> = nets
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap_or(Ordering::Equal))
        .map(|(index, _)| index);
}
yanorei32
  • 3
  • 3
Emily
  • 693
  • 5
  • 10
  • 7
    [*You should not use `partial_cmp(b).unwrap_or(Ordering::Equal)` because it provides unstable results when NaNs are present*](https://stackoverflow.com/a/50308360/155423). This OP specifically guarantees that there's no NaN, but answers are expected to warn people that don't pay attention. – Shepmaster Dec 24 '18 at 15:03
  • But it was stated that NaNs weren't in the data set. – Emily Dec 24 '18 at 15:04
  • 11
    I would just `expect("You promised no NaNs!")` instead of `unwrap_or` to fit the OPs guarantee. Also, take a slice as the argument instead of a Vec. – Sebastian Redl Dec 24 '18 at 19:46
  • 1
    I agree with @SebastianRedl, it's better to `expect` or even just `unwrap`. Even though the question was stated as assuming no `NaN`s, _if a `NaN` did occur_ then this solution could give an incorrect answer (the position of a `NaN` value). – Peter Hall Dec 26 '18 at 02:10
7

The reason why this is tricky is because f32 does not implement Ord. That is because NaN values prevent floating point numbers from forming a total order, which violates the contract of Ord.

There are 3rd party crates that work around this by defining a numeric type wrapper which is not allowed to contain a NaN. One example is ordered-float. If you use this crate to first prepare the collection to contain NotNan values, then you can write code very close to your original idea:

use ordered_float::NotNan;

let non_nan_floats: Vec<_> = nets.iter()
    .cloned()
    .map(NotNan::new)       // Attempt to convert each f32 to a NotNan
    .filter_map(Result::ok) // Unwrap the `NotNan`s and filter out the `NaN` values 
    .collect();

let max = non_nan_floats.iter().max().unwrap();
let index = non_nan_floats.iter().position(|element| element == max).unwrap();

Add this to Cargo.toml:

[dependencies]
ordered-float = "1.0.1"

Bonus material: The type conversion can be made truly zero-cost (assuming you are really sure that there are no NaN values!), by taking advantage of the fact that NotNan has a transparent representation:

let non_nan_floats: Vec<NotNan<f32>> = unsafe { mem::transmute(nets) };
Peter Hall
  • 53,120
  • 14
  • 139
  • 204
  • Thanks Peter, I was under that impression. Infact that's why I stated the assumption in the first line. However being as this is a port from a <100-line Python script, I wanted to stick to std lib. Now everything is finally working. :-) – Juxhin Dec 23 '18 at 13:02
  • 2
    @Juxhin Please bear in mind that one of the design philosophies of the Rust standard library is to adopt new APIs slowly and to instead encourage functionality to emerge in the community. As such, you'll be hard pressed to find a Rust project that doesn't have external dependencies, and getting to know the popular crates is really a big part of being productive in Rust. – Peter Hall Dec 23 '18 at 13:10
  • without create a collection: `let result = nets .iter() .cloned() .map(NotNan::new) .enumerate() .filter_map(|(i, nan)| { let nan = nan.ok()?; Some((i, nan))}) .max_by(|a, b| a.1.cmp(&b.1));` – Stargateur Dec 23 '18 at 14:57
  • `.map(NotNan::new).filter_map(Result::ok)` -> `.flat_map(NotNan::new)` – Shepmaster Dec 24 '18 at 16:37
6

I will probably do something like this:

fn main() -> Result<(), Box<std::error::Error>> {
    let samples = vec![
        0.28, 0.3102, 0.9856, 0.3679, 0.3697, 0.46, 0.4311, 0.9781, 0.9891, 0.5052, 0.9173, 0.932,
        0.8365, 0.5822, 0.9981, 0.9977,
    ];

    // Use enumerate to get the index
    let mut iter = samples.iter().enumerate();
    // we get the first entry
    let init = iter.next().ok_or("Need at least one input")?;
    // we process the rest
    let result = iter.try_fold(init, |acc, x| {
        // return None if x is NaN
        let cmp = x.1.partial_cmp(acc.1)?;
        // if x is greater the acc
        let max = if let std::cmp::Ordering::Greater = cmp {
            x
        } else {
            acc
        };
        Some(max)
    });
    println!("{:?}", result);

    Ok(())
}

This could be implemented by adding a trait on Iterator with for example the function try_max_by.

Stargateur
  • 24,473
  • 8
  • 65
  • 91
4

You can find the maximum value with the following:

let mut max_value = my_vec.iter().fold(0.0f32, |max, &val| if val > max{ val } else{ max });

After finding max_value you can track its position in the vector itself:

let index = my_vec.iter().position(|&r| r == max_value).unwrap();

To get this result you need to iterate twice over the same vector. To improve the performance, you can return the index value with the max value as tuple in the fold iteration.

Playground

Akiner Alkan
  • 6,145
  • 3
  • 32
  • 68
  • You can find max value with single line by: `my_vec.iter().fold(0.0f32, |max, &val| if val > max{ val } else{ max });`, and unwrapping makes code unstable – Ömer Erden Dec 23 '18 at 12:43
  • @Ömer Panicking when you get data you don't know how to deal with is usually preferable to proceeding with bad data, in my opinion. If NaN should be impossible, `unwrap` isn't the wrong way to deal with it – trent Dec 23 '18 at 14:39
  • @trentcl Thanks for pointing it out, i thought it could panic on empty vector then this code can be considered as neat and stable, with the assumption about NAN – Ömer Erden Dec 23 '18 at 15:04
  • Oh, hmm. I misinterpreted the code a little. It does appear to panic on an empty Vec and I agree that it's an issue. – trent Dec 23 '18 at 16:24
  • Yes, I wrote it as a pseudo-code and yes i assumed that calling index check after that the max value is existing in the vec and found in the vec. So, if you are going to call the check index in different code block then you need to check it's max value existency first – Akiner Alkan Dec 23 '18 at 16:56
4

I took the answer from @Akiner Alkan and tweaked it a bit, here's a simple one-liner without any unwrap, doing the job:

let maxi = my_vec.iter().enumerate().fold((0, 0.0), |max, (ind, &val)| if val > max.1 {(ind, val)} else {max});

(PS: new to rust and first post in StackOverflow, don't judge me if I do it wrong :D)

litchipi
  • 53
  • 1
  • 5