11

When reading the source code, I stumbled upon this method in the JDK sources. Please note the declaration and initialization of v and newValue. We have here 'nice' undefined values, assignment in comparisons, which is 'great', and extra brackets for worse readability. And other code smells.

default V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
    Objects.requireNonNull(mappingFunction);
    V v;
    if ((v = get(key)) == null) {
        V newValue;
        if ((newValue = mappingFunction.apply(key)) != null) {
            put(key, newValue);
            return newValue;
        }
    }

    return v;
}

But why? Is there any actual benefit to writing code the above way instead of the simple (ideally with negated v comparison):

default V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
    Objects.requireNonNull(mappingFunction);
    V v  = get(key);
    if (v == null) {
        V newValue = mappingFunction.apply(key);
        if (newValue != null) {
            put(key, newValue);
            return newValue;
        }
    }

    return v;
}

Is there any actual benefit I'm not aware of (besides showing off Java constructs), rather than going with the 'easy' way?

Boann
  • 48,794
  • 16
  • 117
  • 146
Martin Mucha
  • 2,385
  • 1
  • 29
  • 49
  • 3
    If `v` wasn't local it could prevent race conditions where `v` is changed between setting and reading. But here the only advantage seems to scare off people from touching the code :-) – Michael Butscher Oct 26 '18 at 20:48
  • 1
    This is from `ConcurrentMap`, right? You might want to link to the exact source and version to make this clear. I've seen this sort of thing in many JDK classes; it's probably idiomatic to the author but doesn't really obey strict style guides. Anyways, a relevant question. – Mick Mnemonic Oct 26 '18 at 20:48
  • 5
    Purely difference of opinion when it comes to style. – Joe C Oct 26 '18 at 20:49
  • 2
    It's just a peculiar code style (Probably Doug Lea's? It would break [his own conventions](http://gee.cs.oswego.edu/dl/html/javaCodingStd.html): _"Avoid assignments (``='') inside if and while conditions. "_). Would not recommend this in your own code for exactly the reasons you outlined. – Petr Janeček Oct 26 '18 at 20:50
  • Mick Mnemonic: it's just in Map. java.util.Map#computeIfAbsent I do like more readable code, indeed, but I don't think JDK authors are out of their minds, thus I was asking, why they opted for seemingly suboptimal expression. – Martin Mucha Oct 26 '18 at 21:11
  • Michael Butscher: can you elaborate on how it could help if `v` is not local and it's say a field? There is some guarantee of atomicity of this operation when assign and test are followed as in this example? I'm not aware of that, but that's the reason why I'm the one asking questions here ;) – Martin Mucha Oct 26 '18 at 21:13
  • There are similar forms of "microoptimization" in the JDK code (likely by the same author) - for example, https://stackoverflow.com/questions/2785964/in-arrayblockingqueue-why-copy-final-member-field-into-local-final-variable – Marco13 Oct 26 '18 at 23:01
  • The assign-and-test idiom goes way back to early `C` when compilers were less-than-great at optimizing. This idiom would generate _much_ better assembly instructions than separate assignments and tests. Java inherits a lot from C & C++ and the idiom has carried over. When you're familiar with certain idioms, those idioms can be _clearer_ than more verbose code, as you take it all in at once and immediately know what's being done; but the familiarity of course only comes with long(er) experience. – Stephen P Oct 26 '18 at 23:24
  • 1
    This question shouldn't have been closed as opinion-based. It's not asking for an opinion, it's asking objectively whether there is a benefit or if it is a matter of opinion. As usual, close voters put in the minimum effort while endlessly clicking through questions and trying to feel important. – Boann Oct 27 '18 at 20:27
  • Boann: well is it opinion-based? I asked why to use it. Some commenters expressed opinion, that it's just a matter of style and it's code smell. And that's subjective ~opinion based. While commenter of accepted answer demonstrated proof, that given construct is more performant, and make sense in heavy used code. And democratically speaking, his response even has most votes so far. So what's the problem? And to conclude: I'm willing to change best answer, but so far Stesterov has objectively best answer. – Martin Mucha Oct 27 '18 at 20:40

3 Answers3

9

#microoptimization (but in case of a standard library it could matter), and:

#inertia: this pattern was common among C programmers back in the 90-ies, so the titans of computer science may still use this style.

There's no point to write such code for new business logic, unless performance is really critical.


The (micro)optimization:

The bytecode produced by javac (JDK 11) for the original ("bad") version is one JVM-operation less than the (nicer) code. Why? The JDK's version "uses" the return value of the assignment operator (rather than loading the value from a variable) for the if condition evaluation.

However, this is more a limitation of javac's optimization possibilities than a reason to write the less-readable code.

Here's the bytecode for the JDK version, cited in the question:

   0: aload_2
   1: invokestatic  #2                  // Method java/util/Objects.requireNonNull:(Ljava/lang/Object;)Ljava/lang/Object;
   4: pop
   5: aload_0
   6: aload_1
   7: invokevirtual #3                  // Method get:(Ljava/lang/Object;)Ljava/lang/Object;
  10: dup
  11: astore_3
  12: ifnonnull     39
  15: aload_2
  16: aload_1
  17: invokeinterface #4,  2            // InterfaceMethod java/util/function/Function.apply:(Ljava/lang/Object;)Ljava/lang/Object;
  22: dup
  23: astore        4
  25: ifnull        39
  28: aload_0
  29: aload_1
  30: aload         4
  32: invokevirtual #5                  // Method put:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
  35: pop
  36: aload         4
  38: areturn
  39: aload_3
  40: areturn

Below is the bytecode of a more readable version:

public V computeIfAbsent(K key,
                         Function<? super K, ? extends V> mappingFunction) {
    Objects.requireNonNull(mappingFunction);
    final V v = get(key);
    if (v == null) {
        final V newValue = mappingFunction.apply(key);
        if (newValue != null) {
            put(key, newValue);
            return newValue;
        }
    }

    return v;
}

.. and the bytecode is:

   0: aload_2
   1: invokestatic  #2                  // Method java/util/Objects.requireNonNull:(Ljava/lang/Object;)Ljava/lang/Object;
   4: pop
   5: aload_0
   6: aload_1
   7: invokevirtual #3                  // Method get:(Ljava/lang/Object;)Ljava/lang/Object;
  10: astore_3
  11: aload_3
  12: ifnonnull     40
  15: aload_2
  16: aload_1
  17: invokeinterface #4,  2            // InterfaceMethod java/util/function/Function.apply:(Ljava/lang/Object;)Ljava/lang/Object;
  22: astore        4
  24: aload         4
  26: ifnull        40
  29: aload_0
  30: aload_1
  31: aload         4
  33: invokevirtual #5                  // Method put:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
  36: pop
  37: aload         4
  39: areturn
  40: aload_3
  41: areturn
Alex Shesterov
  • 26,085
  • 12
  • 82
  • 103
  • 1
    I believe this is the correct answer. I presume, that JDK devels cannot drive, well if anythihg at least shouldn't drive their work based on own preferences, but based on measured performance. And core libraries like collections has to be fast. Thus microoptimization makes sense here. My eyes still hurt a little from this code, but that's just my habbit what I'm used to look at. So again, I believe this is the correct answer. Thanks for helping me understand it! – Martin Mucha Oct 26 '18 at 21:20
  • 1
    While this _might_ be an optimization, I think [shmosel's](https://stackoverflow.com/a/53016177/905488) answer makes more sense. Only the author knows the real reason. – Mick Mnemonic Oct 26 '18 at 21:26
  • 1
    "this is more a limitation of javac's optimization possibilities than a reason to write the less-readable code" - isn't optimiser limitations a possible reason to write less readable code? – Bernhard Barker Oct 26 '18 at 22:43
  • @Dukeling: Absolutely, and that's the whole reason that good optimizers are essential: it's not a matter of making the programmer's code run faster, but a deterrence against programmers writing anti-idiomatic, unreadable code for the sake of performance. – R.. GitHub STOP HELPING ICE Oct 27 '18 at 00:22
  • 3
    Seriously guys, you cannot make meaningful statements about "optimality" of code by looking at bytecodes. The stuff that is actually executed is native code. The JIT takes these bytecodes, gathers the interpreter stats for them, and then produces *optimized* native code instructions. If you want to try to analyze the code performance based on compiler output, look at the native code. – Stephen C Oct 27 '18 at 00:50
  • It would probably be more meaningful to look at generated native instructions (or possibly bytecode) from the version it was originally written against. – jpmc26 Oct 27 '18 at 01:18
  • Stephen C: good point,just having better byte code need not to necessarily mean, that it's actually faster. I know there was introduced some tool into jdk just to do such performance tests,but I don't possess that skill. So by doing naive test using "computingIfAbsent" for key "test" and all Integer values, ten times in loop, each variant, same system load,both started as individual processes.And by average, optimized variant finished second with time 8598.1ms, where clean version won with 8597.7ms. ~Optimized gives worse outcome by 0.4ms. In my very naive test. If someone can do better one... – Martin Mucha Oct 27 '18 at 21:21
7

I think it's mostly a matter of preference; I'm under the impression Doug Lea prefers this terse style. But if you look at the original version of the method, it makes a little more sense, since it's paired with newValue which does need inline assignment:

default V computeIfAbsent(K key,
        Function<? super K, ? extends V> mappingFunction) {
    V v, newValue;
    return ((v = get(key)) == null &&
            (newValue = mappingFunction.apply(key)) != null &&
            (v = putIfAbsent(key, newValue)) == null) ? newValue : v;
}
shmosel
  • 49,289
  • 6
  • 73
  • 138
5

I agree: it's a code smell and I definitely prefer the second version.

The arcane use of assignment and comparison in a single expression can result in somewhat shorter code in case you have a loop, e.g.:

V v;
while ((v = getNext()) != null) {
    // ... do something with v ...
}

To remove the code smell, you need more code and you need to assign the variable in two places:

V v = getNext();
while (v != null) {
    // ...
    v = getNext();
}

Or you need to move the loop exit condition after the assignment:

while (true) {
    V v = getNext();
    if (v == null)
        break;
    // ...
}

For an if statement, it certainly makes no sense. And even for a loop, I would avoid it.

Codo
  • 75,595
  • 17
  • 168
  • 206
  • 2
    I wouldn't call it arcane by any means. Ubiquitous but not recommended perhaps? – Mad Physicist Oct 26 '18 at 20:57
  • It very much reminds me of old C code like `if (t = p->next)...` which additionally depends on the implicit conversion of a non-null pointer to *true* / null pointer to *false*. But you're right: *arcane* is not the right word. – Codo Oct 26 '18 at 21:02
  • +1 for a great example of where this construct does make code that's more readable/less-DRY-violating. – R.. GitHub STOP HELPING ICE Oct 27 '18 at 00:25