java method: java.lang.Integer.numberOfLeadingZeros(int) can be optimized

Question

the origin code is :

public static int numberOfLeadingZeros(int i) {
    // HD, Figure 5-6
    if (i == 0)
        return 32;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}

I think it can be optimized ,should add following condition:

if (i < 0)
        return 0;

the fully optimized code is :

public static int numberOfLeadingZeros(int i) {
    if(i<=0) {
        return i < 0 ? 0 : 32;
    }
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}

My question is: Is the optimized code correct? Or, Is my think right? — Jason, Oct 12 '17 at 09:28
since I have no idea what it is you're trying to achieve ... don't know — Stultuske, Oct 12 '17 at 09:29
If its just an "optimization", write some unit tests to prove the correctness of your implementation! — Gyro Gearless, Oct 12 '17 at 09:38
I think the optimized code is more efficient, Is that right? — Jason, Oct 12 '17 at 09:39
The introduced possibility for branch misprediction penalties mean that this is *extremely* unlikely to be an optimization. — Cody Gray - on strike, Oct 17 '17 at 08:56
@CodyGray, would you please give more details/proof about the branch mis-prediction penalties? — Jason, Oct 17 '17 at 10:06
If I had wanted to post a complete, referenced answer, I would have done so. I'm not a Java programmer, just an x86 optimization expert. This is highly architecture-dependent, and there's tons of prior work on the topic. For starters, see the canonical: https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array. Other optimization/performance resources available in the [[tag:x86]] tag wiki. — Cody Gray - on strike, Oct 17 '17 at 10:09

assylias · Answer 1 · 2017-10-12T09:43:47.307

5

In theory yes, your suggestion makes sense.

In practice, unless you use an exotic JVM, it will not make any difference because the method is intrinsic, so the code that is executed is not the code you can find in the Java class.

For example on x86/64 cpus, the code is here and uses the bsrl CPU instruction, which is as fast as you can hope for.

edited Oct 12 '17 at 09:43

answered Oct 12 '17 at 09:38

assylias

321,522
82
660
783

so the intrinsic method source in JDK is just for reading, it will never be executed ? – Jason Oct 12 '17 at 09:55
@Jason it depends on the JVM running your program - if that JVM doesn't have intrinsic code for the method, it will run the java code. – assylias Oct 12 '17 at 09:57
So, My suggestion is still useful for the JVM, right? – Jason Oct 12 '17 at 10:00
@Jason As explained, the JVM will execute a few cpu instructions, not java code, so it doesn't need the java code optimisation. – assylias Oct 12 '17 at 10:02
So It means in Open-JDK or Oracle hotspot, the method marked as intrinsic will never run the java code. right? then why not just removed the java code, only run it like call a native method? – Jason Oct 12 '17 at 10:10
check this: https://stackoverflow.com/questions/23041036/why-do-java-intrinsic-functions-still-have-code – Jason Oct 12 '17 at 10:22
pls check the above link, seems like the intrinsic method will still run the java code in some condition? – Jason Oct 12 '17 at 10:29
@Jason yes it will run until the JVM optimises it - which means that there are only two options: either you only call the method a few times and it will be "slow" but it doesn't matter (because you don't use it often), or you call the method a lot and will be optimised. Either way the impact of your proposed optimisation on execution time will be minimal. – assylias Oct 12 '17 at 10:35
quoted from the above link :"A default implementation is provided in case there's no possibility of using the intrinsic version, i.e running on a platform for which no intrinsic version is provided." in this situation, the java method will be run and will 'slow', right? – Jason Oct 12 '17 at 10:43
@Jason yes it will. – assylias Oct 12 '17 at 10:44

Holger · Answer 2 · 2017-10-17T10:15:17.630

2

Besides the fact that this method will likely get replaced by an intrinsic operation for hot spots, this check for negative numbers is only an improvement, if the number is negative. For positive numbers, it is just an additional condition to be evaluated.

So the worth of this optimization depends on the likelihood of negative arguments at this function. When I consider typical use cases of this function, I’d consider negative values a corner case rather than typical argument.

Note that the special handling of zero at the beginning is not an optimization, but a requirement as the algorithm wouldn’t return the correct result for zero without that special handling.

Since your bug report yield to finding an alternative (also shown in your updated question) which improves the negative number case without affecting the performance of the positive number case, as it fuses the required zero test and the test for negative numbers into a single pre-test, there is nothing preventing the suggested optimization.

edited Oct 17 '17 at 10:15

answered Oct 12 '17 at 14:11

Holger

285,553
42
434
765

There are four reasons why the optimization is worthy to do,1.in logic:if the param is negative, the following Binary Search is totally not necessary , this can cut the bit shift operation and save CPU power. 2.in practice, there are lots of micro CPU devices which are sensitive for performance, the optimization is meaningful for them. 3.the likelihood of negative arguments is large in some condition, for example, accept random numbers from outside.4.the code is apparently copied from HD book and did not well considered(optimized) the signed negative numbers. – Jason Oct 12 '17 at 16:17
“accept random numbers from outside” and pass them to `numberOfLeadingZeros` is a far fetched scenario of no practical relevance. I checked all JDK internal uses as well as some other applications, none included the possibility of negative numbers. – Holger Oct 12 '17 at 19:09
@Eugene: well, that's not so hard with an IDE and they where within the set of expected kinds of use cases, which really helps analyzing them. – Holger Oct 12 '17 at 22:21

score 1 · Answer 3 · answered Oct 12 '17 at 16:33

1

Bug has been created on oracle bug database: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8189230

answered Oct 12 '17 at 16:33

Jason

521
2
7

@Holger,@Eugene,@assylias, according to the bug comment from above link, seems that Java official develop team agree with my suggestion too. also they make a further optimization suggestion based on it. – Jason Oct 13 '17 at 09:42
Well, with the suggested change (use `if(i <= 0) return i == 0? 32: 0;` instead), the number of conditionals evaluated for the ordinary (positive number) case is the same as in the old code, so now it is indeed a change that may improve the negative number case without hurting the positive number case, so it’s worth considering the change (keep in mind the comment there: “Better performance should be proved of course”). – Holger Oct 16 '17 at 14:42
@Holger actually,the suggested change given by java official developer is not the best, the better code should be ‘if(i <= 0) return i < 0? 0 : 32;’. And of course better performance already been proved by testing. – Jason Oct 16 '17 at 20:17
Why should `i < 0? 0 : 32` be better than `i == 0? 32: 0`? – Holger Oct 17 '17 at 08:16
Because the `==` operation is slower than `<` operation. If we put the `==` in front , then it will slow down the speed when `i` is not 0. You can run a test, simple proved. – Jason Oct 17 '17 at 08:21
@Holger here is the test code:_italic_ **bold** `public static void main(String[] args) { long st = System.currentTimeMillis(); for(int i=0 ; i>Integer.MIN_VALUE; i--) { slow(i); fast(i); } System.out.println("millis:" + (System.currentTimeMillis() - st)); } public static int slow(int i) { if(i<=0) { return i == 0 ? 32 : 0; } return 1; } public static int fast(int i) { if(i<=0) { return i < 0 ? 0 : 32; } return 1; }` – Jason Oct 17 '17 at 08:32
Perhaps, you should read [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) first. This “test” is ridiculous. – Holger Oct 17 '17 at 08:37
@Holger, please watch you mouth, the test is designed to prove the case when param is negative. just show the effect of `==` operation. – Jason Oct 17 '17 at 08:39
1

Please, just read the linked Q&A. Your test makes *several* fundamental mistakes listed there. Whether using an implementation that favors the negative case makes any sense for real life applications, is an entirely different question. If I were the maintainer of that method, I’d use a branch free solution like `if(i<=0) return ((i>>>31)^1)<<5;` that does not favor one case over the other, not that an optimization that affects the interpreted execution only is that important. But anyway, it’s worth understanding the linked Q&A for the future… – Holger Oct 17 '17 at 08:57
@Holger, shifting bits is also slower than using a ternary operator. – Jason Oct 17 '17 at 09:40
@Holger, so finally, all the discussion proved that the original code of this method is not well considered about the negative param value, and it should/can be optimized. now we are just discussing how better it can be optimized. – Jason Oct 17 '17 at 09:56
@Holger, also, about the bit shifting, it is not friendly for code reading, and it is not faster(if it is not slow) than ternary operator anyway, so there is no reason to use bit-shifting when ternary operator is available. – Jason Oct 17 '17 at 10:02
My answer considered the negative case not worth optimizing *at the expense of positive number case*, as your first suggestion would have done, but after the bug report revealed an alternative that can optimize the negative case without affecting the positive number, there is no reason not to do the optimization. You already got my upvote. – Holger Oct 17 '17 at 10:02
Conditionals can be very expensive, depending on the underlying architecture. The whole point of the algorithm seen in this method is to reduce the maximum number of conditionals the code has to perform. If the zero value didn’t *require* to be special cased, there wasn’t that first `if` at all. You should also have noticed that being easy-to-read is not what this method is about, especially not to someone unfamiliar with bit shifting… – Holger Oct 17 '17 at 10:08
Yeah, you got my upvote too, besides, I think the code should be like this will be better:`if(i > 0){do origin-bit-shifting-logic and return n;};return ((i>>>31)^1)<<5;`.because the most case is positive numbers, so the negative condition logic should be put in last. – Jason Oct 17 '17 at 10:46
That may be useful for environments executing it 1:1. You may suggest it in the bug report and see, how much micro-optimization the JRE maintainers are willing to put into this method (or whether they find an exotic test environment where this is not an improvement). – Holger Oct 17 '17 at 10:52

java method: java.lang.Integer.numberOfLeadingZeros(int) can be optimized

3 Answers3