2

I had some code I was profiling and was surprised at how much time was being spent on Math.min(float, float).

In my use case I needed to get the min of 3 float values, each value is guaranteed to not be NAN or another edge case float value.

My original method was:

private static float min2(float v1, float v2, float v3) {
    return Math.min(Math.min(v1,v2),v3);
}

But I found that this was about 5x faster:

private static float min1(float v1, float v2, float v3) {
    if (v1 < v2 && v1 < v3) {
        return v1;
    }
    else if (v2 < v3) {
        return v2;
    }
    else {
        return v3;
    }
}

For reference this is the code for Math.min:

public static float min(float f1, float f2) {
    if (f1 > f2) {
        return f2;
    }
    if (f1 < f2) {
        return f1;
    }
    /* if either arg is NaN, return NaN */
    if (f1 != f2) {
        return Float.NaN;
    }
    /* min(+0.0,-0.0) == -0.0 */
    /* 0x80000000 == Float.floatToRawIntBits(-0.0f) */
    if (Float.floatToRawIntBits(f1) == 0x80000000) {
        return -0.0f;
    }
    return f2;
}

Note: My use case was symmetric and the above was all true for max instead of min.

EDIT1: It turns out ~5x was an overstatement, but I am still seeing a speed difference inside my application. Although I suspect that may be due to not having a proper timing test.

After posting this question I wrote a proper micro optimization speed test in a separate project. Tested each method 1000 times on random floats and they both took the same amount of time. I don't think it would be useful to post that code as it's just confirming what we all already thought.

There must be something specific to the project I'm working on causing the speed difference.

I'm doing some graphic work in an Android app, and I was finding the min/max of the values from 3 touch events. Again, edge cases like -0.0f and the different infinities are not an issue here. Values range between 0.0f and say 3000f.

Originally I profiled my code using the Android Device Monitor's Method Profiling tool, which did show a ~5x difference. But, this isn't the best way to micro-profile code as I have now learned.

I added the below code inside my application to attempt to get better data:

long min1Times = 0L;
long min2Times = 0L;
...
// loop assigning touch values to v1, v2, v3
        long start1 = System.nanoTime();
        float min1 = min1(v1, v2, v3);
        long end1 = System.nanoTime();
        min1Times += end1 - start1;
        long start2 = System.nanoTime();
        float min2 = min2(v1, v2, v3);
        long end2 = System.nanoTime();
        min2Times += end2 - start2;
        double ratio = (double) (min1Times) / (double) (min2Times);
        Log.d("", "ratio: " + ratio);

This prints a running ratio with each new touch event. As I swirl my finger on the screen, the first ratios logged are either 0.0 or Infinity or NaN. Which makes me think this test isn't very accurately measuring the time. As more data is collected the ratio tends to vary between .85 and 1.15.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
GDanger
  • 1,641
  • 2
  • 22
  • 35
  • What test values did you use? Did the case where two or more were equal come up a lot? – k_g Feb 10 '15 at 23:51
  • What version of java are you using? – bhspencer Feb 10 '15 at 23:52
  • 3
    Can you post the test you performed? If one ran in 5 ms. and the other in 1 ms., it does says that much. – Bart Kiers Feb 10 '15 at 23:53
  • 2
    Did you allow for warm up? Did you use `System.nanoseconds()`? Please show your test code. Proper micro-benchmarking is not so trivial. – Bohemian Feb 11 '15 at 00:03
  • 1
    Your code doesn't handle NaN or negative zero properly, for a start. – user207421 Feb 11 '15 at 00:07
  • I updated info in my edit, and I now believe I asked this question prematurely. I don't really think it's a duplicate of "How do I write a correct micro-benchmark in Java?" but I do think it should be closed as it's off-topic and hard to benchmark the Android aspect of it. Should it be deleted? – GDanger Feb 11 '15 at 01:09
  • @BartKiers now that would be interesting! Considering that looking at the code they seem the same except for edge cases. I switched your test to use `System.nanoTime()` and ran it, sure enough a very large difference. But, if you swap which test is run first, the difference swaps. The issue is you're using the same `a`, `b`, and `c` for the tests, whichever test runs first has to figure out the min, the second test can just re-use that previous work. – GDanger Feb 11 '15 at 16:20
  • :) funky! I thought I swapped them to be sure there wasn't any VM warm-up time... – Bart Kiers Feb 11 '15 at 16:26
  • Yeah, you're right: after giving both tests their own Random (with the same seed!), both `min(...)` methods spend pretty much the same amount of time. – Bart Kiers Feb 11 '15 at 16:28
  • Related: [What is the instruction that gives branchless FP min and max on x86?](https://stackoverflow.com/q/40196817) - I think ARM has similar min/max instructions that exactly implement `(b – Peter Cordes May 07 '21 at 20:07

3 Answers3

2

The problem is about the precision of float values.

If you would call your method with the arguments (0.0f, -0.0f, 0.0f), it will return you 0.0f to be the smallest float - which it isn't (float-wise, -0.0f is smaller)

the nested Min-Method will return the expected result.

So, to answer your question: If two methods are not 100% equal - there is no point in comparing their performance :-)

Java will handle 0.0f == -0.0f as true, but: new Float(0.0)).equals(new Float(-0.0)) will be false! Math.Min will consider this, your method not.

When working with float values, you should never use smaller or equal to operators. Instead you should compare numbers based on your preselected delta to consider them smaller, greater or equal.

float delta = 0.005
if (Math.abs(f1 - f2) < delta) //Consider them equal.
if (Math.abs(f1 - f2) > delta) // not equal. 

And that's what's happening at the end of the Math.min method - JUST in a very very precise manner by actually checking if one number is -0.0f - bitwise.

So the drawback in performance is just the more prcise result beeing calculated.

However, if you compared float values like "10", "5", and "8" - there shouldnt be a performance difference, cause the 0-check is never hit.

dognose
  • 20,360
  • 9
  • 61
  • 107
1

A problem with the performance of the built-in Math.min function stems from some unfortunate-in-retrospect decisions that were made when formulating the IEEE-754 Standard, especially the behavior of comparison and relational operators. The specified behavior was suitable for some purposes, but unsuitable for some other common purposes.

Most notably, if computation x would yield a positive number that is too small to represent, and computation y would yield a negative number that is too small to represent, then there is no way to directly compare the two values, they're not equivalent. Computation of 1.0/x would be interpreted as division by an infinitesimal positive number, while "1.0/y" behaves as division by an infinitesimal negative number. Thus, even though x and y are infinitesimal, and are close enough that comparison and relational operators report them as equal, both Math.min(x,y) and Math.min(y,x) should return y because it's infinitesimally smaller than x.

It strikes me as crazy that people are still designing hardware and programming languages which lack any nice means of comparing floating-point values in such a way that all pairs of values which aren't fully equivalent to each other from each other will be transitively ranked, but that has unfortunately the state of floating-point math for the last several decades. If one needs a function that will return the minimum of x and y in cases where they represent numbers with a non-zero difference, and otherwise arbitrarily return either x or y, such a function could be written more efficiently than one which has to handle the tricky cases involving positive and negative infinitesimal values.

supercat
  • 77,689
  • 9
  • 166
  • 211
0

Your implementation should lead to some really tight bytecode, which is easily turned into equally fast assembly language by a JIT compiler. The version using Math.min has two subroutine calls, and so may not be inlined like yours is. I think the results are to be expected.

Randall Cook
  • 6,728
  • 6
  • 33
  • 68
  • Hopefully good JVMs use some kind of intrinsic implementation instead of the fully naive version of Math.max ([Why is Java's double/float Math.min() implemented this way?](https://stackoverflow.com/q/67414007)). But it's still going to be more work than a single branchless FP min or max instruction. – Peter Cordes May 07 '21 at 20:12