There are essentially two questions that you are asking:
1. Can the getInstance()
method return null
due to reordering?
(which I think is what you are really after, so I'll try to answer it first)
Even though I think designing Java to allow for this is totally insane, it seems like you are in fact correct that getInstance()
can return null.
Your example code:
if (resource == null)
resource = new Resource(); // unsafe publication
return resource;
is logically 100% identical to the example in the blog post you linked to:
if (hash == 0) {
// calculate local variable h to be non-zero
hash = h;
}
return hash;
Jeremy Manson then describes that his code can return 0 due to reordering. At first, I didn't believe it as I thought the following "happens-before"-logic must hold:
"if (resource == null)" happens before "resource = new Resource();"
and
"resource = new Resource();" happens before "return resource;"
therefore
"if (resource == null)" happens before "return resource;", preventing null
But Jeremy gives the following example in a comment to his blog post, how this code could be validly rewritten by the compiler:
read = resource;
if (resource==null)
read = resource = new Resource();
return read;
This, in a single-threaded environment, behaves exactly identically to the original code, but, in a multi-threaded environment might lead to the following execution order:
Thread 1 Thread 2
------------------------------- -------------------------------------------------
read = resource; // null
read = resource; // null
if (resource==null) // true
read = resource = new Resource(); // non-null
return read; // non-null
if (resource==null) // FALSE!!!
return read; // NULL!!!
Now, from an optimization-standpoint, doing this doesn't make any sense to me, since the whole point of these things would be to reduce multiple reads to the same location, in which case it makes no sense that the compiler wouldn't generate if (read==null)
instead, preventing the problem. So, as Jeremy points out in his blog, it is probably highly unlikely that this would ever happen. But it seems that, purely from a language-rules point of view, it is in fact allowed.
This example is actually covered in the JLS:
http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4
The effect observed between the values of r2
, r4
, and r5
in Table 17.4. Surprising results caused by forward substitution
is equivalent to what can happen with the read = resource
, the if (resource==null)
, and the return resource
in the example above.
Aside: Why do I reference the blog post as the ultimate source for the answer? Because the guy who wrote it, is also the guy who wrote chapter 17 of the JLS on concurrency! So, he better be right! :)
2. Would making Resource
immutable make the getInstance()
method thread-safe?
Given the potential null
result, which can happen independently of whether Resource
is mutable or not, the immediate simple answer to this question is: No (not strictly)
If we ignore this highly unlikely but possible scenario, though, the answer is: Depends.
The obvious threading-problem with the code is that it might lead to the following execution order (without any need for any reordering):
Thread 1 Thread 2
---------------------------------------- ----------------------------------------
if (resource==null) // true;
if (resource==null) // true
resource=new Resource(); // object 1
return resource; // object 1
resource=new Resource(); // object 2
return resource; // object 2
So, the non-thread-safety is coming from the fact that you might get two different objects back from the function (even though without reordering neither of them will ever be null
).
Now, what the book was probably trying to say is the following:
The Java immutable objects like Strings and Integers try to avoid creating multiple objects for the same content. So, if you have "hello"
in one spot and "hello"
in another spot, Java will give you the same exact object reference. Similarly, if you have new Integer(5)
in one spot and new Integer(5)
in another. If this were the case with new Resource()
as well, you would get the same reference back and object 1
and object 2
in the above example would be the exact same object. This would indeed lead to an effectively thread-safe function (ignoring the reordering problem).
But, if you implement Resource
yourself, I don't believe there is even a way to have the constructor return a reference to a previously created object rather than creating a new one. So, it should not be possible for you to make object 1
and object 2
be the exact same object. But, given that you are calling the constructor with the same arguments (none in both cases), it could be likely that, even though your created objects aren't the same exact object, they will, for all intents and purposes, behave as if they were, also effectively making the code thread-safe.
This doesn't necessarily have to be the case, though. Imagine an immutable version of Date
, for example. The default constructor Date()
uses the current system time as the date's value. So, even though the object is immutable and the constructor is called with the same argument, calling it twice will probably not result in an equivalent object. Therefore the getInstance()
method is not thread-safe.
So, as a general statement, I believe the line you quoted from the book is just plain wrong (at least as taken out of context here).
ADDITION Re: reordering
I find the resource==new Resource()
example a bit too simplistic to help me understand WHY allowing such reordering by Java would ever make sense. So let me see if I can come up with something where this would actually help optimization:
System.out.println("Found contact:");
System.out.println(firstname + " " + lastname);
if (firstname==null) firstname = "";
if (lastname ==null) lastname = "";
return firstname + " " + lastname;
Here, in the most likely case that both ifs
yield false
, it is non-optimal to do the expensive String concatenation firstname + " " + lastname
twice, once for the debug message, once for the return. So, it would indeed make sense here to reorder the code to do the following instead:
System.out.println("Found contact:");
String contact = firstname + " " + lastname;
System.out.println(contact);
if ((firstname==null) || (lastname==null)) {
if (firstname==null) firstname = "";
if (lastname ==null) lastname = "";
contact = firstname + " " + lastname;
}
return contact;
As examples get more complex and as you start thinking about the compiler keeping track of what is already loaded/computed in the processor registers that it uses and intelligently skipping re-calculation of already existing results, this effect might actually become more and more likely to happen. So, even though I never thought I would ever say this when I went to bed last night, thinking about it more, I do actually now believe that this may have been a needed/good decision to truly allow for code optimization to do its most impressive magic. But it does still strike me as quite dangerous as I don't think many people are aware of this and even if they are, it's quite complex to wrap your head around how to write your code correctly without synchronizing everything (which will then do away many times over with any performance benefits gained from more flexible optimization).
I guess if you didn't allow for this reordering, any caching and reuse of intermediate results of a series of process steps would become illegal, thus doing away with one of the most powerful compiler optimizations possible.