As far as I can tell this is an optimization that is pretty neat.
This was previously written like this:
if (keySet == null) { // volatile read
keySet = new AbstractSet<K>() { // volatile write
....
return keySet; // volatile read
These operations can not be re-ordered because there are memory barriers that are inserted here. So it would look like this:
[StoreLoad]
// volatile read
[LoadLoad]
[LoadStore]
[StoreStore]
[LoadStore]
// volatile write
[StoreLoad]
[StoreLoad] // there's probably just one barrier here instead of two
// volatile read
[LoadLoad]
[LoadStore]
There are lots of barriers here and the most expensive would be the StoreLoad
that is emitted on x86
.
Suppose we drop the volatile
here. Since there are no barriers inserted these operations can be re-ordered in any way they please and there are two racy reads here of the keySet
variable.
We can have a single racy read and store the variable into a local field(since they are local, they are thread safe - no one can change the reference that is locally declared), the only problem as far as I can see is that multiple threads might see a null reference at the same time and initialize it with an empty KeySet
and potentially doing too much work; but that is most probably cheaper than the barriers.
On the other hand if some threads sees a non-null reference, it will 100% see a fully initialized object and that is the comment about final
fields. If all objects are final the JMM guarantees a "freeze" action after the constructor; or in simpler words (IMO) if all fields are final and initialized in the constructor there are two barriers inserted after it: LoadStore
and LoadLoad
; thus achieving the same effect.