
I recently started learning C# and went straight for the memory model. C# and Java have similar (though perhaps not identical) thread safety guarantees regarding reads and writes of volatile fields. But unlike writes to final fields in Java, writes to readonly fields in C# do not provide any specific thread safety guarantee. Thinking about how thread safety works in C# led me to doubt whether there is any real advantage to final fields behaving the way they do in Java.

I learned about the supposed importance of final three years ago. I asked this question and got a detailed answer that I accepted. But now I think it's wrong, or at least irrelevant. I still think fields should be final whenever possible, just not for the reasons commonly believed.

The value of a final field is guaranteed to be visible to any other thread after the constructor returns. But the reference to the object itself must be published in a thread safe manner. If the reference is published safely, then the visibility guarantee of final becomes redundant.

I considered the possibility that it has something to do with public static fields. But logically, a class loader must synchronize the initialization of a class. Synchronization makes the thread safety of final redundant.

So I'm putting forth the heretical idea that the only real value of final is to make immutability self-documenting and self-enforcing. In practice, private non-final fields (and in particular, array elements) are perfectly thread safe as long as they are not modified after the constructor returns.
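To make that claim concrete, here is a minimal sketch of an effectively immutable object. The `Snapshot` and `Registry` classes are hypothetical names invented for illustration. The array field is not final and its elements cannot be, so visibility rests entirely on the `volatile` write and read of the reference.

```java
import java.util.Arrays;

// Hypothetical class for illustration: no final fields, but never modified
// after the constructor returns ("effectively immutable").
class Snapshot {
    private int[] values; // deliberately not final

    Snapshot(int... values) {
        this.values = Arrays.copyOf(values, values.length); // defensive copy
    }

    int get(int i) {
        return values[i];
    }

    int size() {
        return values.length;
    }
}

// Hypothetical holder that publishes the reference through a volatile field.
class Registry {
    private static volatile Snapshot current;

    static void publish(Snapshot s) {
        current = s; // volatile write: earlier writes by this thread become visible...
    }

    static Snapshot read() {
        return current; // ...to any thread whose volatile read sees that write
    }
}
```

Without `volatile` on `current` (or some other happens-before edge), a reader that obtains the reference has no guarantee about the array or its contents.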

Am I wrong?

Edit: Paraphrasing section 3.5 of Java Concurrency in Practice,

Two things can go wrong with improperly published objects. Other threads could see a stale value for the reference, and thus see a null reference or other older value even though a value has been set. But far worse, other threads could see an up-to-date value for the reference, but stale values for the state of the object.

I understand how final fields solve the second problem, but not the first problem. The highest voted answer so far argues that the first problem is not a problem.
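A minimal sketch of the two failure modes, loosely modeled on the `Holder` example in that section of the book. `UnsafePublisher` and its method names are made up for illustration.

```java
// Hypothetical class: its state is not final, so nothing protects it
// when the reference is handed over through a data race.
class Holder {
    private int n; // not final

    Holder(int n) {
        this.n = n;
    }

    int value() {
        return n;
    }
}

class UnsafePublisher {
    static Holder holder; // plain field: neither volatile nor guarded by a lock

    static void publish() {
        holder = new Holder(42);
    }

    // Returns null if no Holder is visible to the calling thread.
    static Integer tryRead() {
        Holder h = holder; // read the racy field once
        if (h == null) {
            return null;   // problem 1: a racing reader may still see the old null
        }
        return h.value();  // problem 2: a racing reader may see 0 instead of 42
    }
}
```

Called from a single thread, `tryRead()` behaves as expected. Both problems only arise when `publish()` and `tryRead()` run on different threads with no happens-before relationship between them. Making `n` final would rule out the second problem but not the first.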

Edit 2: This question arises from a confusion of terminology.

Like the asker of a similar question, I've always understood the term "safe publication" to mean that both an object's internal state and the reference to the object itself are guaranteed visible to other threads. In favor of this definition, Effective Java cites Goetz06, 3.5.3 in defining "safe publication" as (emphasis added)

Transferring such an object reference from one thread to others

Also in favor of this definition, note that the section of Java Concurrency in Practice paraphrased above refers to potentially stale references as being "improperly published."

Whatever you call it, I didn't think that unsafely publishing a reference to an immutable object could ever be useful. But according to this answer, it can. (The example given there is a primitive value, but the same principle could apply to reference values.)

Kevin Krumwiede
  • "private non-final fields (and in particular, array elements) are perfectly thread safe as long as they are not modified after the constructor returns" - that's wrong. If another thread gets a reference to your object without correct synchronization it may see these fields or array values as uninitialized even if they were initialized in the constructor. – Erwin Bolwidt Aug 29 '17 at 00:47
  • @ErwinBolwidt Yes, but OP seems to be under the impression that references must never be shared without synchronization. – shmosel Aug 29 '17 at 00:50
  • @shmosel I think you're right. Also looks like OP thinks that synchronization is an absolute thing, but a proper JMM relationship between threads using synchronization is only achieved by synchronizing on the *same* monitor (likewise, volatile-related rules only apply to the *same* volatile variable, etc), and a classloader doing static initialization is not going to force all other threads to synchronize on its monitor somehow. (btw I mean "correct synchronization" per JMM terminology which can be achieved in various ways) – Erwin Bolwidt Aug 29 '17 at 01:29
  • @ErwinBolwidt I think OP is correct on the issue of static field initialization. See for example https://stackoverflow.com/q/878577/1553851. – shmosel Aug 29 '17 at 01:32
  • @ErwinBolwidt I mean that sentence to be true if and only if references are published safely. – Kevin Krumwiede Aug 29 '17 at 02:07
  • *The highest voted answer so far argues that the first problem is not a problem.* Not quite. The argument is that it's a separate consideration and shouldn't be conflated with the second, "far worse" problem. – shmosel Aug 29 '17 at 07:20
  • Note that the asker of the question you linked (great find btw!) [retracted his interpretation](https://stackoverflow.com/questions/7886577/safe-publication-and-the-advantage-of-being-immutable-vs-effectively-immutable/27261575#comment14755004_7886577). – shmosel Aug 29 '17 at 07:59
  • @shmosel I saw that. I still think the terminology is misleading, in that potentially failing to publish a value can only be considered "safe" under very specific circumstances. Also, the reliability of this probably depends on the architecture. I recently learned that x86/x64 imposes stronger visibility guarantees on Java's weaker ones, but ARM (for example) does not. So quasi-safely publishing immutable objects might work on a PC and fail on Android. – Kevin Krumwiede Aug 29 '17 at 08:11

2 Answers

4

But the reference to the object itself must be published in a thread safe manner. If the reference is published safely, then the visibility guarantee of final becomes redundant.

The first sentence is wrong; the second is therefore irrelevant. final may be redundant in the presence of other safe publication techniques, like synchronization or volatile. But the point of immutable objects is that they're inherently thread-safe, meaning they'll be seen in a consistent state regardless of how the reference is published. So you don't need those other techniques in the first place, at least as far as safe publication is concerned.

EDIT: OP correctly points out that there's some ambiguity around the term "safe publication". In this context I'm referring to consistency of the object's internal state. The issue of visibility as it affects the reference is, in my opinion, a valid but separate concern.
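A minimal sketch of that claim, assuming a hypothetical immutable `Config` class and a `ConfigCache` that hands it out through a plain, non-volatile field (both names and the placeholder values are invented for illustration). A reader that sees a non-null reference is guaranteed by the final field semantics to see `host` and `port` as the constructor set them. The worst outcome of the data race is that a reader sees null and falls through to the lock.

```java
// Hypothetical immutable class: every field is final and `this` does not
// escape the constructor.
final class Config {
    private final String host;
    private final int port;

    Config(String host, int port) {
        this.host = host;
        this.port = port;
    }

    String host() {
        return host;
    }

    int port() {
        return port;
    }
}

class ConfigCache {
    private static Config instance; // deliberately NOT volatile

    static Config get() {
        Config c = instance; // single racy read into a local
        if (c == null) {
            synchronized (ConfigCache.class) {
                c = instance; // re-check under the lock
                if (c == null) {
                    c = new Config("localhost", 8080); // placeholder values
                    instance = c;
                }
            }
        }
        return c; // never a partially constructed Config
    }
}
```

If `Config` had even one non-final field, a racing reader could observe that field's default value unless `instance` were made `volatile`.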

shmosel
  • What I'm getting at is that while the objects themselves may be thread safe, the references to them are not automatically safe. For example, it would not be safe for threads to communicate by reading and writing a non-final, non-volatile reference field, *regardless* of the intrinsic thread safety of the type of the field. If the reading thread sees the written reference value, then yes, the object it refers to will be seen as fully initialized. But it may not see the reference value in the first place. – Kevin Krumwiede Aug 29 '17 at 02:02
  • Thread safety can mean a lot of different things. A more relaxed definition could allow for delayed visibility but require memory consistency. It all depends on your specific requirements. – shmosel Aug 29 '17 at 02:07
  • I'm not aware of any "delayed" visibility. AFAIK, values are either guaranteed visible by a happens-before relationship between writing and reading, or their visibility is totally indeterminate. – Kevin Krumwiede Aug 29 '17 at 02:09
  • To clarify, I'm not aware of any eventual consistency *guarantee* in the JMM. It may be a characteristic of a particular implementation or architecture. – Kevin Krumwiede Aug 29 '17 at 02:13
  • I don't know of such a guarantee either, but as far as I know it does generally work that way in practice. But whatever terminology you prefer, my point is that visibility may not be essential to correctness in all cases. – shmosel Aug 29 '17 at 02:16
  • @KevinKrumwiede For a practical example, consider double-checked locking. Normally, we say the field must be `volatile`, or there's the possibility of seeing a non-null variable with a partially constructed object. With immutable objects, that's impossible. – shmosel Aug 29 '17 at 02:31
  • I don't consider double checked locking a practical example of anything. Before Java 1.5, it was notoriously broken. After Java 1.5, the cost of an uncontested lock was so low that any performance advantage of double checked locking became extremely dubious. Anyway, why would you use double checked locking with an immutable object? – Kevin Krumwiede Aug 29 '17 at 05:25
  • I'm obviously talking about post-1.5. As for it being uncontested, that depends on the amount of concurrent reads. Either way, my intent was to provide an actual scenario where correctness isn't hampered by visibility concerns. Whether that scenario would directly be useful to you wasn't really the point. As for why one would use it, it's the same as always: to lazily initialize some value in a thread-safe manner. I don't see why the value's immutability would change anything. – shmosel Aug 29 '17 at 05:46
  • "I don't see why the value's immutability would change anything." - Exactly why I don't understand the relevance of double checked locking to my question. – Kevin Krumwiede Aug 29 '17 at 05:48
  • It changes everything with regard to thread safety, as I already explained. I don't understand why you think DCL is less useful for immutable objects. – shmosel Aug 29 '17 at 05:49
  • No, I think it's equally (albeit dubiously) useful for both immutable and effectively immutable objects, precisely because it involves the use of `volatile` to ensure the visibility of the reference. By "safe publication", I mean visibly consistent state *and* visibility of the reference. I now understand that you and others take it to mean only the former. – Kevin Krumwiede Aug 29 '17 at 05:58
  • My point was that you *wouldn't* need `volatile` in this case. Visibility is not an issue because the variable is observed to be non-null. And consistency is ensured by its immutability. And yes, your understanding of my understanding is accurate. :) – shmosel Aug 29 '17 at 06:05
  • @shmosel `volatile` is not enough for safe publication - as it would not introduce the memory barriers needed. In fact even if you could work around that (and you could), every single `volatile` would introduce a `StoreLoad`, which is the most expensive operation on `x86`, while `final` with its `StoreStore|LoadStore` is free. I would remove that `volatile` part – Eugene Aug 30 '17 at 07:35
  • @Eugene I'm not quite sure what you're responding to. – shmosel Aug 30 '17 at 07:39
  • @shmosel my bad... `final may be redundant in the presence of other safe publication techniques, like synchronization or volatile`. volatile is not supposed to be in this sentence. – Eugene Aug 30 '17 at 07:40
  • @Eugene I think you're misunderstanding my answer, and possibly the question as well. The discussion is about making the *reference* `volatile` or synchronizing access to the same, not directly replacing `final` with `volatile`. – shmosel Aug 30 '17 at 07:46
  • @shmosel I had the impression that you are talking about safe publication *and* how that is done via `final`; I'm sorry if I misunderstood that. – Eugene Aug 30 '17 at 08:20
  • @Eugene My point was that using `final` fields *within* the field (i.e. making it properly immutable) makes it unnecessary to enforce safe publication at the reference level. I hope we can agree on that. – shmosel Aug 30 '17 at 08:23
0

I've read your question a couple of times and still have some issues understanding it. I will just try to augment the other answer, which is in fact correct. final in Java is all about re-orderings (or happens-before as you call it).

First, this is guaranteed by the JLS; see Final Field Semantics. In the example there, the final field is guaranteed to be seen correctly by other threads, while the non-final field is not. The JLS is correct, but under the current implementation making a single field final would be enough; read further.
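For readers who don't follow the link, here is a sketch closely modeled on that JLS example. The array return value of `reader()` and the local copy of `f` are additions so the observed values can be inspected; they are not part of the JLS text.

```java
class FinalFieldExample {
    final int x; // final: frozen when the constructor completes
    int y;       // not final: no such guarantee
    static FinalFieldExample f; // published through a data race

    FinalFieldExample() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new FinalFieldExample();
    }

    // Returns {x, y} as observed by the calling thread,
    // or null if no instance is visible yet.
    static int[] reader() {
        FinalFieldExample local = f;
        if (local == null) {
            return null;
        }
        int i = local.x; // guaranteed to be 3
        int j = local.y; // may be 0 when racing with writer()
        return new int[] { i, j };
    }
}
```

When `writer()` and `reader()` run on the same thread, both fields are of course seen as written. The difference only shows up when they race on different threads.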

Every single write to a final field inside a constructor is followed by two memory barriers, StoreStore and LoadStore (hence the name happens-before): the store to a final field happens before any read of that same field, and the memory barriers are what guarantee it.

But the current implementation does not do that. The memory barriers are not emitted after every single write to a final field; they are emitted once, at the end of the constructor, see this. The important line there is:

 _exits.insert_mem_bar(Op_MemBarRelease, alloc_with_final());

Op_MemBarRelease is the actual LoadStore|StoreStore barrier, so under the current implementation it is enough for a single field to be final for all the others to be safely published as well. But, of course, rely on that at your own risk.

volatile in this context would not be enough to make publication safe, since it would simply not introduce the necessary barriers; you can read a bit more here.

Note that publishing the reference is not a concern, simply because the JLS says: "Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values."

Eugene
  • The main thing I was getting at was that the memory barrier you're talking about only guarantees that the values of the object's fields are visible to other threads. It does not guarantee that the value of a reference to the object is visible to other threads. (References being atomic also has nothing to do with their visibility.) I was correct about what happens if the reference is not safely published. I was wrong about unsafely published references always being wrong. – Kevin Krumwiede Aug 30 '17 at 08:58
  • @KevinKrumwiede ah! now it makes sense - you are talking about two different things `Safe Initialization` and `Safe Publication` here. Indeed for safe publication you would need the reference to be `volatile` for example. There is a lot more here : https://shipilev.net/blog/2014/safe-public-construction/ – Eugene Aug 30 '17 at 13:14
  • @KevinKrumwiede and btw, this would *still* be about memory barriers - as volatile would introduce the memory barriers needed for the actions done before a volatile write to be visible after the volatile read; thus making the reference `volatile` would be enough. – Eugene Aug 30 '17 at 13:17
  • Right. Part of my confusion was that some sources apparently use the term "safe publication" to mean safe initialization and *not* what I think of as safe publication. – Kevin Krumwiede Aug 30 '17 at 18:36
  • @KevinKrumwiede well, in that case you should change the title of the question to reflect that - you could help a lot of people searching for these exact things in the future. Thank you – Eugene Aug 31 '17 at 11:29