42

The code below (Java Concurrency in Practice listing 16.3) is not thread safe for obvious reasons:

public class UnsafeLazyInitialization {
    private static Resource resource;

    public static Resource getInstance() {
        if (resource == null)
            resource = new Resource();  // unsafe publication
        return resource;
    }
}

However, a few pages later, in section 16.3, they state:

UnsafeLazyInitialization is actually safe if Resource is immutable.

I don't understand that statement:

  • If Resource is immutable, any thread observing the resource variable will either see it null or fully constructed (thanks to the strong guarantees of final fields provided by the Java Memory Model).
  • However, nothing prevents instruction reordering: in particular, the two reads of resource could be reordered (there is one read in the if and one in the return). So a thread could see a non-null resource in the if condition but return a null reference (*).

I think UnsafeLazyInitialization.getInstance() can return null even if Resource is immutable. Is that the case, and why (or why not)?


(*) To better understand my point about reordering: this blog post by Jeremy Manson, one of the authors of Chapter 17 of the JLS on concurrency, explains how String's hashCode is safely published via a benign data race, and how removing the use of a local variable can lead to hashCode incorrectly returning 0, due to a possible reordering very similar to the one I describe above:

What I've done here is to add an additional read: the second read of hash, before the return. As odd as it sounds, and as unlikely as it is to happen, the first read can return the correctly computed hash value, and the second read can return 0! This is allowed under the memory model because the model allows extensive reordering of operations. The second read can actually be moved, in your code, so that your processor does it before the first!
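The pattern described in the quote can be sketched as follows. This is only an illustration, not the actual java.lang.String source: the class name CachedHash and its method names are hypothetical. The first method reads the shared field once into a local; the second reads it twice, which is the variant the quote warns about.

```java
// Hypothetical class illustrating the benign-race idiom; not the real String source.
final class CachedHash {
    private final String value;
    private int hash; // deliberately non-volatile; 0 means "not computed yet"

    CachedHash(String value) {
        this.value = value;
    }

    // Single read of the field: the value returned is the value that was tested.
    int hashSingleRead() {
        int h = hash;               // the only read of the shared field
        if (h == 0) {
            h = compute();
            hash = h;               // racy write, but every thread writes the same value
        }
        return h;                   // returns the local, never re-reads the field
    }

    // Two reads of the field: per the quote, the second read may be performed
    // before the first, so 0 could be returned when another thread races with us.
    int hashTwoReads() {
        if (hash == 0) {            // first read
            hash = compute();
        }
        return hash;                // second, independent read
    }

    // Same polynomial as String.hashCode(): s[0]*31^(n-1) + ... + s[n-1].
    private int compute() {
        int h = 0;
        for (int i = 0; i < value.length(); i++) {
            h = 31 * h + value.charAt(i);
        }
        return h;
    }
}
```

In a single thread both methods behave identically; the difference only matters when another thread writes the field concurrently.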

double-beep
assylias
  • It is so because if `Resource` is immutable, all instances of it are equivalent and even if two threads get in a race and get two different instances of `Resource`, functionally they are the same. – Abhinav Sarkar Jan 31 '13 at 11:19
  • @AbhinavSarkar I'm not asking if two threads could return different instances of resource - I'm asking whether a thread could return null. – assylias Jan 31 '13 at 11:21
  • If the instance was mutable, then its state could be modified, and the second null seeer could initialize a new instance with a pristine state. – Joop Eggen Jan 31 '13 at 11:25
  • I'd say you have found an erratum in JCIP. It should be corrected to use a local variable. – Marko Topolnik Jan 31 '13 at 17:16
  • 1
    Now that's what I call a **constructive** debate. – biziclop Feb 05 '13 at 21:32
  • email Brian Goetz and ask for a refund:) – irreputable Feb 06 '13 at 05:10
  • 1
    Woweee... Learned something today that made my head spin. I can't believe this is the way Java works. Awesome question! Thanks! – Markus A. Feb 10 '13 at 05:09

10 Answers

3

UPDATE Feb 10

I'm becoming convinced that we should separate two phases: compilation and execution.

I think the deciding factor in whether it is allowed to return null is the bytecode. I made 3 examples:

Example 1:

The original source code, literally translated to bytecode:

if (resource == null)
    resource = new Resource();  // unsafe publication
return resource;

The bytecode:

public static Resource getInstance();
Code:
0:   getstatic       #20; //Field resource:LResource;
3:   ifnonnull       16
6:   new             #22; //class Resource
9:   dup
10:  invokespecial   #24; //Method Resource."<init>":()V
13:  putstatic       #20; //Field resource:LResource;
16:  getstatic       #20; //Field resource:LResource;
19:  areturn

This is the most interesting case, because there are 2 reads (Line#0 and Line#16) and 1 write in between (Line#13). I claim that it is not possible to reorder them, but let's examine it below.

Example 2:

The "compiler-optimized" code, which can be literally converted back to Java as follows:

Resource read = resource;
if (resource==null)
    read = resource = new Resource();
return read;

The bytecode for that (I actually produced this by compiling the code snippet above):

public static Resource getInstance();
Code:
0:   getstatic       #20; //Field resource:LResource;
3:   astore_0
4:   getstatic       #20; //Field resource:LResource;
7:   ifnonnull       22
10:  new     #22; //class Resource
13:  dup
14:  invokespecial   #24; //Method Resource."<init>":()V
17:  dup
18:  putstatic       #20; //Field resource:LResource;
21:  astore_0
22:  aload_0
23:  areturn

Obviously, if the compiler "optimizes" and produces bytecode like the above, a null read can occur (see, for example, Jeremy Manson's blog).

It is also interesting to see how a = b = c works: the reference to the new instance (Line#14) is duplicated (Line#17), and the same reference is then stored first to b (resource, Line#18) and then to a (read, Line#21).

Example 3:

Let's make an even slighter modification: read the resource only once! If the compiler starts to optimize (and to use registers, as others mentioned), this is a better optimization than the one above, because Line#4 here is a "register access" rather than the more expensive "static access" in Example 2.

Resource read = resource;
if (read == null)   // reading the local variable, not the static field
    read = resource = new Resource();
return read;

The bytecode for Example 3 (also created by literally compiling the above):

public static Resource getInstance();
Code:
0:   getstatic       #20; //Field resource:LResource;
3:   astore_0
4:   aload_0
5:   ifnonnull       20
8:   new     #22; //class Resource
11:  dup
12:  invokespecial   #24; //Method Resource."<init>":()V
15:  dup
16:  putstatic       #20; //Field resource:LResource;
19:  astore_0
20:  aload_0
21:  areturn

It is also easy to see that it is not possible to get null from this bytecode, since it is constructed the same way as String.hashCode(), with only 1 read of the static variable resource.
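For reference, a complete class in the single-read style might look like the sketch below. The names ImmutableResource and SingleReadLazyInitialization are hypothetical, and the resource is assumed to be immutable in the JCIP sense (all state in final fields). Different threads may still end up with different instances; the sketch only illustrates that the returned reference is the one that was tested.

```java
// Hypothetical immutable resource: all state lives in final fields.
final class ImmutableResource {
    private final String name;

    ImmutableResource(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

final class SingleReadLazyInitialization {
    private static ImmutableResource resource; // still racy: not volatile, no lock

    static ImmutableResource getInstance() {
        ImmutableResource read = resource;     // the only read of the static field
        if (read == null) {
            read = new ImmutableResource("default");
            resource = read;                   // racy publication of an immutable object
        }
        return read;                           // a local variable, so no second shared read
    }
}
```

Because the method returns the local rather than re-reading the field, there is no second read of the field that could be reordered ahead of the null check.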

Now let's examine Example 1:

0:   getstatic       #20; //Field resource:LResource;
3:   ifnonnull       16
6:   new             #22; //class Resource
9:   dup
10:  invokespecial   #24; //Method Resource."<init>":()V
13:  putstatic       #20; //Field resource:LResource;
16:  getstatic       #20; //Field resource:LResource;
19:  areturn

You can see that Line#16 (the read of variable#20 for the return) must observe the write from Line#13 (the assignment of variable#20 from the constructor), so it is illegal to place it earlier in any execution order where Line#13 is executed. So, no reordering is possible.

For a JVM it is possible to construct (and take advantage of) a branch that (using certain extra conditions) bypasses the Line#13 write: the condition is that the read from variable#20 must not be null.

So, in neither case is it possible for Example 1 to return null.

Conclusion:

Seeing the examples above, bytecode like that in Example 1 WILL NOT PRODUCE null. Optimized bytecode like that in Example 2 WILL PRODUCE null, but there is an even better optimization, Example 3, which WILL NOT PRODUCE null.

Because we cannot be prepared for every possible optimization of every compiler, we can say that returning null is possible in some cases and impossible in others, and that it all depends on the bytecode. Also, we have shown that there is at least one example of each case.


Older reasoning: Referring to assylias's example, the main question is: is it valid (with respect to all specs, JMM, JLS) for a VM to reorder the reads at 11 and 14 so that 14 happens BEFORE 11?

If that could happen, then the independent Thread 2 could write the resource at 23, so 14 could read null. I claim that it is not possible.

Actually, because there is a possible write at 13, it would not be a valid execution order. A VM may optimize the execution order so that it excludes the non-executed branches (leaving just 2 reads, no writes), but to make this decision it must do the first read (11), and it must read non-null, so the read at 14 cannot precede the read at 11. So, it is NOT possible to return null.


Immutability

Concerning immutability, I think that this statement is not true:

UnsafeLazyInitialization is actually safe if Resource is immutable.

If the constructor is unpredictable, surprising results are possible. Imagine a constructor like this:

public class Resource {
    public final double foo;

    public Resource() {
        this.foo = Math.random();
    }
}

If we have two threads, the result may be that the 2 threads receive differently behaving objects. So, the full statement should read:

UnsafeLazyInitialization is actually safe if Resource is immutable and its initialization is consistent.

By consistent I mean that calling the constructor of Resource twice yields two objects that behave in exactly the same way (calling the same methods in the same order on both will yield the same results).
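The distinction can be sketched with two immutable classes, only one of which initializes consistently. The class names and the constant 42.0 are hypothetical; constructing two instances in a row stands in for two racing threads that each run the constructor once.

```java
// Hypothetical classes: both are immutable (only final fields),
// but only one of them initializes consistently.
final class RandomResource {
    final double foo;

    RandomResource() {
        this.foo = Math.random(); // differs from one construction to the next
    }
}

final class ConsistentResource {
    final double foo;

    ConsistentResource() {
        this.foo = 42.0;          // the same for every construction
    }
}

final class ConsistencyDemo {
    public static void main(String[] args) {
        // Model the race by simply constructing two instances of each class.
        RandomResource r1 = new RandomResource();
        RandomResource r2 = new RandomResource();
        ConsistentResource c1 = new ConsistentResource();
        ConsistentResource c2 = new ConsistentResource();

        System.out.println("random:     " + r1.foo + " vs " + r2.foo);
        System.out.println("consistent: " + c1.foo + " vs " + c2.foo);
    }
}
```

With ConsistentResource it does not matter which of the two instances a caller receives; with RandomResource it does.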

gaborsch
  • When thinking in terms of non-synchronized reordering you really should not (and cannot) consider multiple threads. The write that Thread-2 does at line 23 has no bearing on what Thread-1 sees so long as Thread-1, in isolation, sees the same outcome 100% of the time. My answer illustrates why a thread can see a null value. – John Vint Feb 05 '13 at 20:19
  • As for immutability (your second point): if Resource is immutable, then all defined fields are available after construction of the object (store-store load), which can only be achieved with final fields. Thus you don't run into racy field allocations. – John Vint Feb 05 '13 at 20:20
  • @JohnVint In a single-threaded VM you're right, but in the current example it is unpredictable whether the write at 23 would affect the read at 14 or not. Both cases are possible (because it's not synchronized). – gaborsch Feb 05 '13 at 20:27
  • I should've mentioned I still believe assylias' example is incorrect in terms of proof (which if true causes more confusion). I don't think it is possible to have that execution path occur. That is if `resource = null` is the default null write which happens-before publication – John Vint Feb 05 '13 at 20:28
  • @JohnVint An immutable object can still initialize its state in lazy manner (e.g. `String.hashcode`) - that's what I meant by *not initializing its state*. The word *safe* means for me that a *valid object is returned*. Look at the assembly code in this article: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html - you can see that an uninitialized object simply can be returned for our method (the assignation to the field comes before the object initialisation). – gaborsch Feb 05 '13 at 20:34
  • `UnsafeLazyInitialization is actually safe if Resource instantiation does not initialize its internal state` Is false though because all final fields can be initialized and Resource here is safe to be published – John Vint Feb 05 '13 at 20:37
  • @JohnVint *can be initialized* does not mean *must be initialized*. As the previous link shows, without synchronisation an uninitialised object can be returned. – gaborsch Feb 05 '13 at 20:42
  • What do you mean by uninitialized then? In the article he says in terms of what another thread may see, `see the default values for fields of the helper object, rather than the values set in the constructor.`. If they are final though (as JCiP indicates), this cannot happen. – John Vint Feb 05 '13 at 20:54
  • A fully uninitialized object may still be safe (as you gave the example of the benign data race with String). It is still safe to publish though. The article you linked explains how a partially constructed object's fields can be seen as their default value (whether null, 0 or false) even though they are instantiated inline or in the constructor. – John Vint Feb 05 '13 at 21:00
  • @JohnVint If you initialize, you assign a value to final field(s). It takes time, so - having the reference (the example shows that another thread *may have* the reference) - another thread can access it even before *all the fields* got their values. So, you have got a partially initialized object. That's not safe. – gaborsch Feb 05 '13 at 21:12
  • 2
    @GaborSch I had assumed that is the conclusion you came to but it's unfortunately incorrect. Final fields in Java have special rules during object constructions. Take a look at http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.5. `when the object is seen by another thread, that thread will always see the correctly constructed version of that object's final fields` and `int i = f.x; // guaranteed to see 3 `. – John Vint Feb 05 '13 at 21:15
3

The confusion I think you have here is what the author meant by safe publication. He was referring to the safe publication of a non-null Resource, but you seem to get that.

Your question is interesting - is it possible to return a null cached value of resource?

Yes.

The compiler is allowed to reorder the operations as follows:

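public static Resource getInstance() {
    Resource reordered = resource;   // hoisted read: may still see null
    if (resource != null) {          // later read: may already see the new value
        return reordered;            // so null can be returned here
    }
    return (resource = new Resource());
}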

This doesn't violate intra-thread (as-if-serial) semantics, because a single thread cannot tell the difference, but it can return a null value.

Whether or not this is the best implementation is up for debate, but there are no rules to prevent this type of reordering.

John Vint
  • @assylias Updated based on your link to Jeremy Manson – John Vint Jan 31 '13 at 16:42
  • Thank you for the update and +1 for the step by step example. – assylias Jan 31 '13 at 16:52
  • I don't believe this is a valid reordering because, simply, it violates the program order. This puts the read inside the conditional before the conditional, and the program establishes a happens-after relationship with respect to one thread executing these instructions. It is not true that any reordering at all of normal fields is OK; it has to respect the JLS / program order and this does not. I mean, right? think of the implications of this were true. You could make reads and writes happen any time? – Sean Owen Jan 31 '13 at 17:50
  • Reads and writes can happen any time so long as they maintain the same program order. Explain practically how this violates the program order? This is logically equivalent. Nothing happens after the conditional null check so how is this different? If response is not null then both `reordered` and `response` have the same values. – John Vint Jan 31 '13 at 18:25
  • The read that is used in the condition comes before the read for the return value in the original. Your change reorders them. This creates the problem that the variable appears to take on its previous value after the thread has observed a newer value. (This is when the first read is null but not the second.) That can't happen; more writes may intervene but unobserving a change can't be consistent with program order. I don't think JSR 133 allows this. Or else I can show that most threaded Java code is hopelessly meaningless. – Sean Owen Feb 01 '13 at 00:26
  • 1
    @SeanOwen I have read through the JLS specs, but *by JLS* nothing forbids to reorder two reads that read the same variable. Also, *by JLS* nothing forbids to reorder reads and writes (although this is nonsense). Please don't just *think* and argue on facts that you are not sure about, rather try to find some evidence for it. – gaborsch Feb 01 '13 at 09:24
  • @GaborSch Regarding *nothing forbids to reorder reads and writes (although this is nonsense)*: the main reason for the weak memory model is to allow optimisations - in particular (JLS): *The semantics of the Java programming language allow compilers and microprocessors to perform optimizations that can interact with incorrectly synchronized code in ways that can produce behaviors that seem paradoxical* and *If we were to use sequential consistency as our memory model, many of the compiler and processor optimizations that we have discussed would be illegal.* – assylias Feb 01 '13 at 09:56
  • Fair enough, I am not an expert, though I understand (I thought!) the memory model well. I am appealing to Java's as-if-serial semantics, not guessing. But, maybe I misinterpret it. This is a great discussion. I don't think you're suggesting that the model allows reordering load/store to one location, because that breaks even a single-threaded program. Here however we are talking about reordered reads only, yes. We're not even talking about observing several effects of other threads out of order -- I had thought -- because there is only one action in question in between. – Sean Owen Feb 01 '13 at 10:05
  • @assylias I didn't mean that *reordering* is nonsense, I meant (what you were referring also) that the *actual execution order* may be nonsense. Sorry if I wasn't clear. I totally agree with the reordering and optimisation, but I think that the JLS should have covered this read-write reordering issue. – gaborsch Feb 01 '13 at 10:10
  • @John I am confused because you are talking about reordering, but it's not just _reordering_, these are _optimizations_ which can completely obscure the program. For instance, in the example given by you, the resource assignment is done outside the if-body, which is not the case in the original program. When the compiler or runtime is allowed to do such things, then it becomes very hard to reason about thread safety. And that's exactly the point made by the author of the blog post, which was quoted by assylias. – proskor Feb 01 '13 at 11:50
  • 1
    @proskor `compiler or runtime is allowed to do such things, then it becomes very hard to reason about thread safety.` Yes it does! As a developer, when you execute your program in a single threaded environment you are guaranteed for it to execute as it appears when you wrote it, but that's it. Assylias did make the point, but his question was 'is it possible to return null'. And yes it is, that was my point I made. – John Vint Feb 01 '13 at 15:41
  • @proskor As far as the confusion, these re-ordering do obscure the program but you should never see that nor should care. I debugged a JIT compiler bug in which a local method store was used instead of the global field which really obscured the program but is fine under a single threaded envt. However, in this instance it violated a JMM rule for a multi threaded envt and was later fixed. http://stackoverflow.com/questions/10620680/why-volatile-in-java-5-doesnt-synchronize-cached-copies-of-variables-with-main/10621390#10621390 – John Vint Feb 01 '13 at 15:44
  • 2
    @SeanOwen, there is *no formal* guide on how to compile Java source to bytecode. In the absence of volatiles, both the Java source compiler and the JVM can do any reordering they please, and it'd be legal as long as sequential execution meets the spec. Technically you can put a busy loop anywhere and it'd be legal. You mistake quality of implementation for the JLS. – bestsss Feb 05 '13 at 11:44
  • 2
    @proskor, *but it's not just reordering, these are optimizations which can completely obscure the program* - obscure doesn't make it wrong: the JIT does a lot of optimizations and reordering if it sees fit, and the CPU routinely uses out-of-order execution and branch prediction. The Alpha CPU can actually guess a value and later check if the guess met the expectation... all that is fair game. This is the very reason the JMM exists -- to prohibit such execution when necessary. – bestsss Feb 05 '13 at 11:50
3

After applying the JLS rules to this example, I have come to the conclusion that getInstance can definitely return null. In particular, JLS 17.4:

The memory model determines what values can be read at every point in the program. The actions of each thread in isolation must behave as governed by the semantics of that thread, with the exception that the values seen by each read are determined by the memory model.

It is then clear that, in the absence of synchronization, null is a legal outcome of the method, since each of the two reads may independently observe any write that the memory model allows it to see, including the initial null.


Proof

Decomposition of reads and writes

The program can be decomposed as follows (to clearly see the reads and writes):

                              Some Thread
---------------------------------------------------------------------
 10: resource = null; //default value                                  //write
=====================================================================
           Thread 1               |          Thread 2                
----------------------------------+----------------------------------
 11: a = resource;                | 21: x = resource;                  //read
 12: if (a == null)               | 22: if (x == null)               
 13:   resource = new Resource(); | 23:   resource = new Resource();   //write
 14: b = resource;                | 24: y = resource;                  //read
 15: return b;                    | 25: return y;                    

What the JLS says

JLS 17.4.5 gives the rules for a read to be allowed to observe a write:

We say that a read r of a variable v is allowed to observe a write w to v if, in the happens-before partial order of the execution trace:

  • r is not ordered before w (i.e., it is not the case that hb(r, w)), and
  • there is no intervening write w' to v (i.e. no write w' to v such that hb(w, w') and hb(w', r)).

Application of the rule

In our example, let's assume that thread 1 sees null and properly initialises resource. In thread 2, an invalid execution would be for 21 to observe 23 (due to program order) - but any of the other writes (10 and 13) can be observed by either read:

  • 10 happens-before all actions so no read is ordered before 10
  • 21 and 24 have no hb relationship with 13
  • 13 does not happen-before 23 (no hb relationship between the two)

So both 21 and 24 (our 2 reads) are allowed to observe either 10 (null) or 13 (not null).
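The rule can be checked mechanically on the numbered actions. The sketch below is a hypothetical toy model, not part of the JLS or of any library: it records program order as happens-before edges, takes the transitive closure, and then applies the two conditions quoted above. The chain for Thread 2 includes 23 only to express program order; the trailing arguments of allowedToObserve list the writes actually performed in the trace being examined.

```java
// Hypothetical model of the JLS 17.4.5 "allowed to observe" rule,
// applied to the numbered actions of the decomposition above.
final class HappensBeforeModel {
    private static final int MAX = 32;                     // action ids used here are below 32
    private final boolean[][] hb = new boolean[MAX][MAX];  // hb[a][b] means hb(a, b)

    // Records program order along a chain of actions: a1 -> a2 -> a3 ...
    void order(int... chain) {
        for (int i = 0; i + 1 < chain.length; i++) {
            hb[chain[i]][chain[i + 1]] = true;
        }
    }

    // Happens-before is transitive, so compute the transitive closure.
    void close() {
        for (int k = 0; k < MAX; k++)
            for (int i = 0; i < MAX; i++)
                for (int j = 0; j < MAX; j++)
                    if (hb[i][k] && hb[k][j]) hb[i][j] = true;
    }

    // True if read r may observe write w, given the writes performed in the trace.
    boolean allowedToObserve(int r, int w, int... writes) {
        if (hb[r][w]) return false;                        // r is ordered before w
        for (int other : writes) {
            if (other != w && hb[w][other] && hb[other][r]) {
                return false;                              // an intervening write hides w
            }
        }
        return true;
    }

    static HappensBeforeModel example() {
        HappensBeforeModel m = new HappensBeforeModel();
        m.order(10, 11, 12, 13, 14, 15);                   // default write, then Thread 1
        m.order(10, 21, 22, 23, 24, 25);                   // default write, then Thread 2
        m.close();
        return m;
    }

    public static void main(String[] args) {
        HappensBeforeModel m = example();
        // Trace in which Thread 2 skips its branch: the writes performed are 10 and 13.
        System.out.println("21 may observe 10: " + m.allowedToObserve(21, 10, 10, 13));
        System.out.println("21 may observe 13: " + m.allowedToObserve(21, 13, 10, 13));
        System.out.println("24 may observe 10: " + m.allowedToObserve(24, 10, 10, 13));
        System.out.println("24 may observe 13: " + m.allowedToObserve(24, 13, 10, 13));
        // 21 precedes 23 in program order, so 21 can never observe 23.
        System.out.println("21 may observe 23: " + m.allowedToObserve(21, 23, 10, 13, 23));
    }
}
```

The model only covers happens-before consistency; it says nothing about the additional causality requirements of JLS 17.4.8.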

Execution path that returns null

In particular, assuming that Thread 1 sees a null on line 11 and initialises resource on line 13, Thread 2 could legally execute as follows:

  • 24: y = null (reads write 10)
  • 21: x = non null (reads write 13)
  • 22: false
  • 25: return y

Note: to clarify, this does not mean that T2 sees non null and subsequently sees null (which would breach the causality requirements) - it means that from an execution perspective, the two reads have been reordered and the second one was committed before the first one - however it does look as if the later write had been seen before the earlier one based on the initial program order.

UPDATE 10 Feb

Back to the code, a valid reordering would be:

Resource tmp = resource; // null here
if (resource == null) { // resource is not null here, so the branch is skipped
    resource = tmp = new Resource();
}
return tmp; // returns null

And because that code is sequentially consistent (if executed by a single thread, it will always have the same behaviour as the original code) it shows that the causality requirements are satisfied (there is a valid execution that produces the outcome).
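The single-threaded equivalence can be checked with a small sketch. The class names are hypothetical and Resource is an empty stand-in; the transformed form hoists a read above a null check that runs the branch only when the field is still null.

```java
// Hypothetical stand-in for the Resource class of the question.
final class Resource { }

final class OriginalForm {
    private static Resource resource;

    static Resource getInstance() {
        if (resource == null)
            resource = new Resource();
        return resource;
    }
}

final class TransformedForm {
    private static Resource resource;

    static Resource getInstance() {
        Resource tmp = resource;        // read hoisted above the null check
        if (resource == null) {         // second read of the same field
            resource = tmp = new Resource();
        }
        return tmp;                     // in a single thread, both reads agree
    }
}

final class SingleThreadEquivalence {
    public static void main(String[] args) {
        Resource a1 = OriginalForm.getInstance();
        Resource a2 = OriginalForm.getInstance();
        Resource b1 = TransformedForm.getInstance();
        Resource b2 = TransformedForm.getInstance();
        // In a single thread both forms return a non-null, stable instance.
        System.out.println((a1 != null) + " " + (a1 == a2));
        System.out.println((b1 != null) + " " + (b1 == b2));
    }
}
```

The two forms only diverge when another thread writes the field between the two reads in TransformedForm, which is exactly the null-returning execution described above.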


After posting on the concurrency interest list, I got a few messages regarding the legality of that reordering, which confirm that null is a legal outcome:

  • The transformation is definitely legal since a single-threaded execution won't tell the difference. [Note that] the transformation doesn't seem sensible - there's no good reason a compiler would do it. However, given a larger amount of surrounding code or perhaps a compiler optimization "bug", it could happen.
  • The statement about intra-thread ordering and program order is what made me question the validity of things, but ultimately the JMM relates to the bytecode that gets executed. The transformation could be done by the javac compiler in which case null will be perfectly valid. And there are no rules for how javac has to convert from Java source to Java bytecode so...
assylias
  • This is great analysis, but: here's my central point (perhaps of confusion). The JLS requires more than happens-before consistency. Right? See Example 17.4.8-1. Happens-before Consistency Is Not Sufficient. Not violating happens-before consistency (and yeah I think it doesn't) is not all that is needed. But honestly the more I read this part of the JLS the less I am sure of what is required! – Sean Owen Feb 01 '13 at 18:58
  • +1, I believe this is the correct answer. See this [example](http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5-600) that is similar. The problem is that the read of resource does *not* establish a happens-before relationship with the write, i.e. the fact that resource is non-null at `x = resource` does not establish `hb(write at 13, read at 21)` so `hb(read at 21, read at 24)` does not imply `hb(write at 13, read at 24)`, anything can be read. – Alex DiCarlo Feb 01 '13 at 20:33
  • 1
    @assylias the only reason I am not quite sold on this is because (and I have to find proof of this) you cannot go 'back in time' and read an older value. That is, if resources was initially non-null then I don't think the JMM supports the scenario where resources will be an older value (which is null). In fact the link you pointed to specifically says the second read can be done before the first (which implies more of my example). – John Vint Feb 01 '13 at 21:56
  • @JohnVint One way of looking at it is that we have an additional line before all the others `20: r = resource` and `24: y = r` and the rest the same as @assylias's example. In this case, `r` could be null, while `x` is non-null. While this is unlikely, the Java memory model specifically states that this is possible. – Alex DiCarlo Feb 01 '13 at 22:13
  • 1
    This is crazy! Suppose there are two boolean variables `x` and `y`, which are initially set to `true`, and two threads: thread 1: `while(x); y = false;` and thread 2: `while(y); x = false;`. Obviously, none of the threads can possibly terminate, because both x and y are set to true. But, by your reasoning, it _is_ possible! One thread only needs to see the write "in the future" by the other thread, which is allowed since there is no _happens-before_ between them, and therefore they can appear out of order to other threads. But then my question is: which thread terminated first? – proskor Feb 02 '13 at 00:47
  • @proskor I believe this is a different example. That would fall in the ["out-of-thin-air" category](http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#d5e28616) and would not be a valid execution (assuming the initial `true` values are properly published). In particular, there is no sequential execution that would let either loop exit. In contrast, (I think) the proof I gave works because there is no write in thread 2. But it seems that there is a flaw in my analysis (cf previous comment) so I am a bit lost here! ;-) JMM without synchronization is a nightmare! – assylias Feb 02 '13 at 00:52
  • 1
    But if that is your argument - there is no sequential execution that would let the loop exit, therefore it is not allowed - then it would also invalidate your proof, because analogously there is no sequential execution resulting in a null pointer being returned either. There is no way resource can go back to null after it was read as not-null. – proskor Feb 02 '13 at 01:24
  • @assylias Is this related to [JLS 17.4.4]? (The write of the default value (zero, false, or null) to each variable synchronizes-with the first action in every thread.) – ben ben Feb 04 '13 at 08:41
  • @benben That is fine: it means the null value is observable by T1 and T2. – assylias Feb 04 '13 at 08:43
  • @assylias is it ok to interpret "new Resource()" as assignment of an initial value? – ben ben Feb 04 '13 at 10:35
  • 1
    @benben no it isn't. Which is why this example is in JCiP, an assignment inline with object construction may or may not be available to other threads that reads that object as instantiated. – John Vint Feb 04 '13 at 14:36
  • 1
    @JohnVint FYI: http://cs.oswego.edu/pipermail/concurrency-interest/2013-February/date.html – assylias Feb 04 '13 at 14:59
  • @assylias Look at that, so I guess my example is technically correct (though written differently). Good question assylias and nice follow up! – John Vint Feb 04 '13 at 15:07
  • 1
    I enjoy David Holmes explanation of "The transformation does not seem sensible but that doesn't mean it is prohibited." – John Vint Feb 04 '13 at 15:11
  • 1
    @JohnVint or *"Establishing the legality, or not, of such a transformation is very difficult"*... – assylias Feb 04 '13 at 15:14
  • 2
    `Resource resource` and racing, I guess B. Goetz position would be very JVM dependent. If the JVM decides to actually uses already zero'd memory (w/ write barrier,hence doesn't need to do any init code) but still decides to set "Resource=null" in the clinit method but w/o any barrier, it could potentially do so, if there is no trap for the class resolution in another thread/code. I still can't see how it can race and the JVM to be implemented correctly. – bestsss Feb 05 '13 at 11:59
  • 1
    Quote David Holmes: "The JCiP example is not considering null as a possible outcome and is only looking at the safe-publication aspect." Here he is a) confirming your hypothesis that `null` can be returned and b) interpreting the JCIP statement as not pertaining to that fact. He is right, but that JCIP statement is clearly not didactically sound and should still be corrected/clarified, in my opinion. – Marko Topolnik Feb 05 '13 at 21:29
  • @MarkoTopolnik Thanks for the references, now what do we do with all the answers that say it cannot return null? :) – John Vint Feb 05 '13 at 21:46
  • 1
    @JohnVint Deleting them would be a tasteful choice. :) – Marko Topolnik Feb 05 '13 at 21:56
  • @JohnVint I had not understood your original comment initially - I have added a final note which I think brings clarification. – assylias Feb 05 '13 at 22:07
  • 1
    As far as causality, the point is that actions must be committed in some definite order. In the JLS example, this prevents the situation where the `write x` action is committed both before and after `write y`. In our example, we commit the second (in program order) read before the first read and the write; this gives causal consistency and doesn't create any causality paradoxes (loops). – Marko Topolnik Feb 05 '13 at 22:44
  • 2
    Maybe I will not be popular, but I still don't believe that this "reordering" is valid. IMHO this is not *reorder*, but rather a *program transformation*, which is not the same. A **new variable** is introduced here, which is **not present** in the original code. I could rewrite this function in 10 different ways, all of them would do the same (in single-threaded model, but behave differently in the multithreaded model), but we have to examine the original one. I think that there is a break in the happens-before consistency. – gaborsch Feb 10 '13 at 17:51
  • @GaborSch: Suppose code read `thing1.resource`, `thing2.resource`, and `thing1.resource`, with no intervening method calls or writes to any object's `.resource` field. Would it be unreasonable or illegitimate for the compiler to have the third read use the value retrieved by the first one? If `thing1` and `thing2` are the same object, that would move the third read before the second. – supercat Feb 08 '14 at 21:58
  • @supercat No, we cannot make such kind of assumptions. It is the programmers' job to create thread-safe code, not the compilers' to figure out what the programmer really wanted. – gaborsch Feb 09 '14 at 00:42
  • If you only have a *single* thread (say T2), which is entering this function for the first time ever, and `24` is reordered to happen before `21` (meaning both `x` and `y` are `null` before `22`, wouldn't this code always return `null` as result (since `y` was read at the top)? – vgru Aug 27 '14 at 13:27
  • @Groo no, the execution must be consistent with the program order for a given thread. – assylias Aug 27 '14 at 13:41
  • @assylias: I know it has to be consistent, that's why I don't understand how moving `24` to the top is a valid transformation? I.e. wouldn't reading `y` only once, at the top of the method, make it always return the `null` value on the first access (even in a single thread)? I am pretty sure I am missing something, but I just can't see what. – vgru Aug 27 '14 at 13:51
  • @assylias: or, alternatively, doesn't `W(resource)` in line `23` have a *happens-before* relationship with `r(resource)` at line `24`, which would be broken by such transformation? – vgru Aug 27 '14 at 13:54
  • @Groo I see what you are saying - the example would have been better with `y = resource` between 21 and 22, and `23: y = resource = new Resource()` - then the reordering is consistent with the program order. – assylias Aug 28 '14 at 13:31
2

There are essentially two questions that you are asking:

1. Can the getInstance() method return null due to reordering?

(which I think is what you are really after, so I'll try to answer it first)

Even though I think designing Java to allow for this is totally insane, it seems like you are in fact correct that getInstance() can return null.

Your example code:

if (resource == null)
    resource = new Resource();  // unsafe publication
return resource;

is logically 100% identical to the example in the blog post you linked to:

if (hash == 0) {
    // calculate local variable h to be non-zero
    hash = h;
}
return hash;

Jeremy Manson then describes that his code can return 0 due to reordering. At first, I didn't believe it as I thought the following "happens-before"-logic must hold:

   "if (resource == null)" happens before "resource = new Resource();"
                                   and
     "resource = new Resource();" happens before "return resource;"
                                therefore
"if (resource == null)" happens before "return resource;", preventing null

But Jeremy gives the following example in a comment to his blog post, how this code could be validly rewritten by the compiler:

read = resource;
if (resource==null)
    read = resource = new Resource();
return read;

This, in a single-threaded environment, behaves exactly identically to the original code, but, in a multi-threaded environment might lead to the following execution order:

Thread 1                        Thread 2
------------------------------- -------------------------------------------------
read = resource;    // null
                                read = resource;                      // null
                                if (resource==null)                   // true
                                    read = resource = new Resource(); // non-null
                                return read;                          // non-null
if (resource==null) // FALSE!!!
return read;        // NULL!!!

Now, from an optimization-standpoint, doing this doesn't make any sense to me, since the whole point of these things would be to reduce multiple reads to the same location, in which case it makes no sense that the compiler wouldn't generate if (read==null) instead, preventing the problem. So, as Jeremy points out in his blog, it is probably highly unlikely that this would ever happen. But it seems that, purely from a language-rules point of view, it is in fact allowed.
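To convince yourself that this schedule really ends in a null, you can replay it by hand in a single thread. This is only a sketch: `InterleavingReplay`, `shared` and `replay()` are made-up names, `Object` stands in for `Resource`, and the statement order is fixed by hand rather than by a real scheduler or JIT. It shows what the rewritten code would return under this one schedule, not that any JVM actually performs the rewrite.

```java
// Hypothetical single-threaded replay of the two-thread schedule shown above.
// `shared` plays the role of the static field `resource`; each simulated
// thread keeps its own local `read`, as in the compiler-rewritten code.
class InterleavingReplay {
    static Object shared;

    /** Replays the schedule; returns { thread 1 result, thread 2 result }. */
    static Object[] replay() {
        shared = null;

        Object read1 = shared;              // T1: read = resource;  (null)

        Object read2 = shared;              // T2: read = resource;  (null)
        if (shared == null) {               // T2: true
            read2 = shared = new Object();  // T2: read = resource = new Resource();
        }
        Object result2 = read2;             // T2: return read;      (non-null)

        if (shared == null) {               // T1: false, T2 has already written
            read1 = shared = new Object();
        }
        Object result1 = read1;             // T1: return read;      (still null)

        return new Object[] { result1, result2 };
    }

    public static void main(String[] args) {
        Object[] results = replay();
        System.out.println("thread 1 returned null: " + (results[0] == null));
        System.out.println("thread 2 returned null: " + (results[1] == null));
    }
}
```

Thread 1 skips the if block because it sees thread 2's write, yet it returns the stale local it read before that write.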

This example is actually covered in the JLS:

http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4

The effect observed between the values of r2, r4, and r5 in Table 17.4, "Surprising results caused by forward substitution", is equivalent to what can happen with the read = resource, the if (resource==null), and the return resource in the example above.

Aside: Why do I reference the blog post as the ultimate source for the answer? Because the guy who wrote it is also the guy who wrote chapter 17 of the JLS on concurrency! So, he'd better be right! :)

2. Would making Resource immutable make the getInstance() method thread-safe?

Given the potential null result, which can happen independently of whether Resource is mutable or not, the immediate simple answer to this question is: No (not strictly)

If we ignore this highly unlikely but possible scenario, though, the answer is: Depends.

The obvious threading-problem with the code is that it might lead to the following execution order (without any need for any reordering):

Thread 1                                 Thread 2
---------------------------------------- ----------------------------------------
if (resource==null) // true;  
                                         if (resource==null)          // true
                                             resource=new Resource(); // object 1
                                         return resource;             // object 1
    resource=new Resource(); // object 2
return resource;             // object 2

So, the non-thread-safety is coming from the fact that you might get two different objects back from the function (even though without reordering neither of them will ever be null).

Now, what the book was probably trying to say is the following:

Some of Java's immutable types, like String and Integer, try to avoid creating multiple objects for the same content. So, if you have the literal "hello" in one spot and "hello" in another spot, Java will give you the same exact object reference. Similarly, if you have Integer.valueOf(5) in one spot and Integer.valueOf(5) in another (an explicit new Integer(5), on the other hand, always creates a fresh object). If this were the case with new Resource() as well, you would get the same reference back and object 1 and object 2 in the above example would be the exact same object. This would indeed lead to an effectively thread-safe function (ignoring the reordering problem).
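A quick way to check which expressions hand back a shared instance, as a sketch (the class and method names are made up for illustration). String literals are interned and Integer.valueOf caches small values (at least -128 to 127), while an explicit constructor call such as new String("hello") always creates a distinct object:

```java
// Hypothetical helper class; each method compares references, not contents.
class IdentityDemo {
    /** Two identical string literals refer to the same interned object. */
    static boolean literalsShared() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    /** Integer.valueOf returns a cached object for small values such as 5. */
    static boolean valueOfShared() {
        Integer a = Integer.valueOf(5);
        Integer b = Integer.valueOf(5);
        return a == b;
    }

    /** An explicit constructor call always allocates a new object. */
    static boolean constructedShared() {
        String a = new String("hello");
        String b = new String("hello");
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println("literals shared:    " + literalsShared());
        System.out.println("valueOf shared:     " + valueOfShared());
        System.out.println("constructed shared: " + constructedShared());
    }
}
```

So sharing comes from literals and factory methods, never from `new`, which is why `new Resource()` in the racy method cannot hand both threads the same object.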

But, if you implement Resource yourself, I don't believe there is even a way to have the constructor return a reference to a previously created object rather than creating a new one. So, it should not be possible for you to make object 1 and object 2 be the exact same object. But, given that you are calling the constructor with the same arguments (none in both cases), it could be likely that, even though your created objects aren't the same exact object, they will, for all intents and purposes, behave as if they were, also effectively making the code thread-safe.

This doesn't necessarily have to be the case, though. Imagine an immutable version of Date, for example. The default constructor Date() uses the current system time as the date's value. So, even though the object is immutable and the constructor is called with the same argument, calling it twice will probably not result in an equivalent object. Therefore the getInstance() method is not thread-safe.
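A sketch of that situation, replaying the two-thread schedule from above in a single thread. `Stamp` is a hypothetical immutable class whose no-arg constructor captures outside state (a counter standing in for the system clock), and `StampRaceReplay` is likewise made up; the interleaving is fixed by hand, so this only illustrates the outcome, it does not provoke a real race.

```java
// Hypothetical immutable class: its only instance field is final, but the
// no-arg constructor captures external state, so two instances differ.
final class Stamp {
    private static int nextId = 0;
    private final int id;

    Stamp() {
        this.id = next();
    }

    private static synchronized int next() {
        return nextId++;
    }

    int id() {
        return id;
    }
}

class StampRaceReplay {
    static Stamp shared;  // plays the role of the static field `resource`

    /** Replays the check-then-act race; returns { thread 1 result, thread 2 result }. */
    static Stamp[] replay() {
        shared = null;
        boolean t1SawNull = (shared == null);  // T1: if (resource == null) -> true
        boolean t2SawNull = (shared == null);  // T2: if (resource == null) -> true

        if (t2SawNull) shared = new Stamp();   // T2: object 1
        Stamp result2 = shared;                // T2: return resource;

        if (t1SawNull) shared = new Stamp();   // T1: object 2 overwrites object 1
        Stamp result1 = shared;                // T1: return resource;

        return new Stamp[] { result1, result2 };
    }
}
```

Both callers get a fully constructed, immutable object, but the two objects carry different ids, so callers that care about the value are not getting "the" instance.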

So, as a general statement, I believe the line you quoted from the book is just plain wrong (at least as taken out of context here).

ADDITION Re: reordering

I find the resource = new Resource() example a bit too simplistic to help me understand WHY allowing such reordering by Java would ever make sense. So let me see if I can come up with something where this would actually help optimization:

System.out.println("Found contact:");
System.out.println(firstname + " " + lastname);
if (firstname==null) firstname = "";
if (lastname ==null) lastname  = "";
return firstname + " " + lastname;

Here, in the most likely case that both ifs yield false, it is non-optimal to do the expensive String concatenation firstname + " " + lastname twice, once for the debug message, once for the return. So, it would indeed make sense here to reorder the code to do the following instead:

System.out.println("Found contact:");
String contact = firstname + " " + lastname;
System.out.println(contact);
if ((firstname==null) || (lastname==null)) {
    if (firstname==null) firstname = "";
    if (lastname ==null) lastname  = "";
    contact = firstname + " " + lastname;
}
return contact;

As examples get more complex and as you start thinking about the compiler keeping track of what is already loaded/computed in the processor registers that it uses and intelligently skipping re-calculation of already existing results, this effect might actually become more and more likely to happen. So, even though I never thought I would ever say this when I went to bed last night, thinking about it more, I do actually now believe that this may have been a needed/good decision to truly allow for code optimization to do its most impressive magic. But it does still strike me as quite dangerous as I don't think many people are aware of this and even if they are, it's quite complex to wrap your head around how to write your code correctly without synchronizing everything (which will then do away many times over with any performance benefits gained from more flexible optimization).

I guess if you didn't allow for this reordering, any caching and reuse of intermediate results of a series of process steps would become illegal, thus doing away with one of the most powerful compiler optimizations possible.

Markus A.
  • Yes this is a good comprehensive discussion, but it still leaves me puzzled, because the JLS says that happens-before consistency is not sufficient (17.4.8-1), and suggests that sequential consistency is needed, but doesn't mention sequential consistency in 17.4.7, and says it would be restrictive in 17.4.3. But it would forbid the out-of-order read. – Sean Owen Feb 10 '13 at 16:48
  • PS I am still not sure how this reordering is happens-before consistent. 17.4.5 definitely says that when an action x comes before an action y in the same thread in program order, x happens-before y. Here the reads then seem to have a happens-before relationship violated by the reordering. – Sean Owen Feb 10 '13 at 16:50
  • 1
    @SeanOwen I am still wondering about the sequential consistency as well, so I don't think I'll be able to give you a good answer there. But for your second question, I think a statement from a little lower in 17.4.5 answers it: `It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal.` If you ask me, all this allowed reordering is a bit crazy, but that's the way they chose to do it. – Markus A. Feb 10 '13 at 17:47
  • @SeanOwen I can't find the exact place right now, but I read in one of these documents that it is definitely legit for two reads to the same memory location to be swapped, if they are not separated by a write in the same thread. So, if you look at the two possible execution paths of the modified code as two separate programs entirely, you get `read=resource`, `if (resource==null)`, `return read` in one case (the read-swap is legal as there is no write in between) and `if (resource==null)`, `read=resource=new Resource()`, `return read` in the other case, which doesn't even involve any swaps. – Markus A. Feb 10 '13 at 17:50
  • @SeanOwen I guess the reasoning behind things can begin to make a bit more sense if you start thinking about actual machine code execution. Then, you can think of the `read` variable in the example above as a processor register (`EAX`, for example). Now, in more complex code, it might make sense that you still have `EAX` loaded from what happened earlier (here: `read=resource`), but, for a more complex `if`-condition, the compiler decides it is better to do something involving a re-read of the same memory location to preserve states of other registers that will be needed again later... – Markus A. Feb 10 '13 at 17:56
  • @SeanOwen I added a code example to my answer that (at least to me) motivates why Java would allow to do such a thing. Hope this helps! :) – Markus A. Feb 10 '13 at 18:17
  • @GaborSch If I understand correctly, he says that what Buu wrote is **not equivalent code** because he forgot to change a line inside the if statement as well: the line `hash = something` will need to be changed to `h = hash = something`. Just like in our case, the `resource = new Resource()` becomes `read = resource = new Resource()`. Then it IS equivalent. (*"...the equivalent code has another write to h before the end of the if statement"*) – Markus A. Feb 10 '13 at 18:36
  • @MarkusA. An interesting side-note: Jeremy stated that *The processor wouldn't do this kind of transformation, though, the compiler would* - so, this **must be in the bytecode**; the **JVM would not reorder** things this way by itself. So, given a class file whose bytecode is *actually a direct transformation of the original source code*, **this reorder cannot happen**. – gaborsch Feb 10 '13 at 18:45
  • @GaborSch I'm not sure what he means by *"the processor"* and *"the compiler"*. Some JVMs have Just-In-Time compilers in them that optimize the byte code for the platform on which it will run. So, in this set of 3 (the compiler Java to byte code, the JVM (including JIT), and the actual CPU), I think he is only saying that the CPU of the target machine would not do this reordering, but the compiler or the JVM might. This is just my understanding, though, and I'm not 100% sure on this. E.g. some optimizations only work for certain numbers of processor registers, which is platform dependent. – Markus A. Feb 10 '13 at 18:48
  • @MarkusA. I updated my answer. I examined the byte code produced by the compiler, and made some statements, covering even this topic. – gaborsch Feb 10 '13 at 21:57
  • @GaborSch Are you sure, though, that the JIT inside the JVM won't make any further changes and reorderings to the code? Also, different compilers (or compiler flags) might lead to different byte code, no? – Markus A. Feb 10 '13 at 21:58
  • @MarkusA. The JIT is working with the compiled bytecode. E.g. if the 3rd bytecode is given, in no way you will get `null` (I can't estimate the number of `String.hashcode` calls, but it has never returned 0). Also, a byte code with reordered reads will surely produce `null`. My statement is that it all depends on the bytecode. We can argue that the first example returns null or not, but I think that the question was about a piece of `java code`, and for that, the answer is: **depends on the byte code**. – gaborsch Feb 10 '13 at 22:10
  • 1
    @GaborSch Hmmm... I usually view the byte-code as just an intermediate step in the full compilation process, which consists of the work of javac (or similar) and the JVM's JIT. Optimizations can happen anywhere along this chain, not only in the first step. So, I'm not sure that simply looking at the byte-code instead of the original source will help much. Clearly, it'll tell you if something bad already happened in the javac-step, but it won't necessarily tell you that nothing bad will happen later in the JIT step. If javac was allowed to introduce problematic optimizations, so is the JIT... – Markus A. Feb 10 '13 at 22:18
  • @MarkusA. You are right, it is a long chain from writing optimized source code until how the registers are handled in a RISC processor (for example). I only state that the "first step" (compilation) may decide the way how your code is executed. Probably there are cases when it is not obvious, or depends on the succeeding steps (maybe Example 1 is such), but it is irrelevant from the questions' POV. – gaborsch Feb 10 '13 at 22:29
  • @GaborSch I just don't think there's much of a difference in what is allowed to happen in the compilation from java-code to byte-code versus the JIT-compilation from byte-code to machine-code. So, I'm not sure it answers the question usefully. That's like someone asking: "Can I drive from home to work (20 mi away) without ever having to take a left turn?" and answering that with: "You have to check if you can drive the first 10 miles without taking a left turn!" If the answer is "No", it is indeed an answer, but if it's "Yes" you haven't learned anything. – Markus A. Feb 10 '13 at 22:39
  • @MarkusA. Yes, that's a decisive point in my opinion, and if you take a closer look at the examples, you'll see the point. This is not a "driving home" question, but let's play with it a bit: you have to drive home, after 10 miles you are in a crossing where you have 3 options: 1) you go on the highway which will **surely** take you home without left turn, 2) you go on a road which **has** a left turn, or 3) go a third road, which may be good, but may be wrong, too. After you decided to take #1, #2 or #3, your way is **determined** by your decision. – gaborsch Feb 10 '13 at 22:53
  • @GaborSch I'm just not sure that the JIT-compiler is actually more restricted by what reordering it is allowed to do to the byte-code than the java compiler is restricted as to what reordering it can do to your java source. So, whether the java compiler introduces a problem doesn't really tell you about whether the JIT compiler might introduce (or fix!?!) one later. Just like the first 10 miles of driving don't tell you anything about the second 10 miles. – Markus A. Feb 10 '13 at 23:24
  • OK, let's forget about this driving example, return to the java code. Tell me then a concrete example how the JIT could "break" the code in *Example 3*. If you can, I entitle you to reimplement `String.hashcode()` in a safer way. – gaborsch Feb 10 '13 at 23:48
  • @GaborSch Agreed, but in your *Example 3* (and in String.hashcode) the JAVA SOURCE is already written so that neither the java compiler nor the JIT may break it. All I'm saying is that you can't just look at the byte-code and come up with a definitive answer in **all cases** (your *Examples 1 and 2*). And in the cases where you can, you can make the same argument simply based on Java source (just like you did). **Subjectively**, I find your analysis interesting, but over-complicated. But others might find it very enlightening. So I'm happy that your answer is available here as well. – Markus A. Feb 11 '13 at 00:07
  • 1
    let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/24295/discussion-between-markus-a-and-gaborsch) – Markus A. Feb 11 '13 at 00:12
0

Nothing sets the reference to null once it is non-null. It's possible for a thread to see null after another thread has set it to non-null but I don't see how the reverse is possible.

I'm not sure instruction re-ordering is a factor here, but interleaving of instructions by two threads is. The if branch can't somehow be reordered to execute before its condition has been evaluated.

Sean Owen
  • 3
    Why could the reads of `resource` in `return resource;` and in `if (resource == null)` not be reordered? – assylias Jan 31 '13 at 11:25
  • 3
    It's also explained in the book, chapter 16.1.3: "**Program order rule**. Each action in a thread _happens-before_ every action in that thread that comes later in the program order." Of course, reads of `resource` can be interleaved by multiple threads, but there is no way a thread can return null, because in every single thread the `if`-part is always evaluated before the return statement. – proskor Jan 31 '13 at 13:02
  • 1
    @proskor How do you explain the example given in the blog post I linked to then? Also I don't think your interpretation of the program order rule is accurate: see for example 16.1.2 and the explanation of listing 16.1. Actions of one thread can be reordered from the perspective of another thread. – assylias Jan 31 '13 at 13:09
  • 2
    @proskor Happens before has a very specific meaning in the JLS - [in particular](http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5-210): *if two actions share a happens-before relationship, they do not necessarily have to appear to have happened in that order to any code with which they do not share a happens-before relationship.* – assylias Jan 31 '13 at 13:17
  • Well, I've read the blog post and the whole discussion thread at least three times. I admit that I was not aware of the problem Jeremy described, but honestly, I am still not convinced. I don't think that reordering is an issue. And since the answer to the last post by Vladimir is very cryptic: "That sentence refers to the semantics of the evaluation of the thread, not the actual evaluation of the thread by the JVM.", I suspect that something is not right there. Especially since the given example in the last post has a completely different structure than the hashCode example. – proskor Jan 31 '13 at 14:44
  • 1
    I don't think that the two reads of resource can ever be reordered, because there is a write to it in between. For the same reason, I don't think that the two reads of hash in the blog post can be reordered either. Imagine the case where resource is actually null: then there is no way to execute the second "return"-read before evaluating the if-expression. – proskor Jan 31 '13 at 14:56
  • I also do not think it is a problem. The logic is this: the first read of resource is not guaranteed to happen-after another thread's write of it. But, it *will* either happen-after, or it won't. If the read is not `null`, then the other thread's write happened-before, in this case. The other statements in the method, to the current thread, happen-after the first statement (first read). So they happen-after an assignment to non-`null`. So it must return non-`null`. – Sean Owen Jan 31 '13 at 14:58
  • @proskor I understand your reasoning, but since the guy who wrote that post also wrote the specs of the language, I would tend to believe him. – assylias Jan 31 '13 at 15:06
  • @proskor Look at this article, this will give you a hint: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html It also digs down to the assembly level. It does not specifically cover *that* reordering we are curious about, but reading it you can imagine how it can work in our case. – gaborsch Jan 31 '13 at 16:30
  • 1
    @assylias Well said. For those who don't think it is an issue look at http://g.oswego.edu/dl/jmm/cookbook.html. The grid says a Normal-Load can be reordered in front of a Normal-Store (so long as it maintains program order). There is nothing preventing the resource field to be read and cached first then returned if the initial predicate is true. There is no happens-before relationship here. – John Vint Jan 31 '13 at 16:45
  • Here, putting the load in front of the store violates program order. There is a store, because the predicate is true. In fact, this resource already points out that you can't reorder a load to come before a store to the same location. – Sean Owen Jan 31 '13 at 17:34
  • @SeanOwen What difference in the application occurs if you order the load before the store? None. If you stored the value into a method-local variable and evaluated the global field again, they will have the same value in a single-threaded application. And to answer your second point: the first empty box shows that you can move loads and stores ahead of each other as long as program order is maintained. – John Vint Jan 31 '13 at 17:47
  • Here, without the reordering, this can't return `null`. Otherwise it can. In a single-threaded application, this changes behavior. The `!= null` branch of your example is not the problem. It's the other one. If `resource = new ...` you can't `return reordered` just after, which is I assume the rewrite you are implying. – Sean Owen Jan 31 '13 at 17:59
  • I'm referring to the passage, "Blank cells in the table mean that the reordering is allowed if the accesses aren't otherwise dependent with respect to basic Java semantics (as specified in the JLS). For example even though the table doesn't say so, you can't reorder a load with a subsequent store to the same location." Or else... what is the limit you think exists? There has to be some or I can just say every write can happen anywhere, which can't be so. Surely it must be constrained by the JLS and, well, the program? – Sean Owen Jan 31 '13 at 18:00
  • @SeanOwen thank you for the bounty - I was actually hesitating to place one! Have you seen the link to concurrency interest I posted earlier? – assylias Feb 04 '13 at 23:39
0

I'm sorry if I'm wrong (I'm not a native English speaker), but it seems to me that the statement in question:

UnsafeLazyInitialization is actually safe if Resource is immutable.

is taken out of context. This statement actually refers to initialization safety:

The guarantee of initialization safety allows properly constructed immutable objects to be safely shared across threads without synchronization

...

Initialization safety guarantees that for properly constructed objects, all threads will see the correct values of final fields that were set by the constructor
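As a rough sketch of what "properly constructed immutable object" means here: all fields are final, they are set in the constructor, and `this` does not escape before the constructor finishes. `ConfigResource` and its fields are made-up names, not the book's `Resource`.

```java
// Hypothetical immutable class. Every field is final and assigned exactly
// once in the constructor, and `this` is never handed out during construction.
final class ConfigResource {
    private final String name;
    private final int[] limits;

    ConfigResource(String name, int[] limits) {
        this.name = name;
        this.limits = limits.clone();  // defensive copy: callers keep no alias
    }

    String name() {
        return name;
    }

    int limit(int index) {
        return limits[index];  // the array itself is never exposed
    }
}
```

For an object like this, a thread that obtains the reference through a data race still sees `name` and the contents of `limits` as the constructor left them. That says nothing about whether the thread sees the reference itself or null.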

Andremoniy
  • That would make sense indeed - so if I understand well, you agree that a thread could return null from `getInstance` (but a thread could not return a partially constructed object because of immutability). – assylias Jan 31 '13 at 11:57
  • 2
    Yes, I agree. But I don't agree that the authors of the book meant that you could just leave the code in listing 16.3 as it is and not worry, as long as `Resource` is immutable. – Andremoniy Jan 31 '13 at 12:02
0

After reading through the post you linked more carefully, you are correct, the example you posted could conceivably (under the current memory model) return null. The relevant example is way down in the comments of the post, but effectively, the runtime can do this:

public class UnsafeLazyInitialization {
    private static Resource resource;

    public static Resource getInstance() {
        Resource tmp = resource;
        if (resource == null)
            tmp = resource = new Resource();  // unsafe publication
        return tmp;
    }
}

This obeys the constraints for a single-thread, but could result in a null return value if multiple threads are calling the method (the first assignment to tmp gets a null value, the if block sees a non-null value, tmp gets returned as null).

In order to make this "safely" unsafe (assuming Resource is immutable), you have to explicitly read resource only once (similar to how you should treat a shared volatile variable):

public class UnsafeLazyInitialization {
    private static Resource resource;

    public static Resource getInstance() {
        Resource cur = resource;
        if (cur == null) {
            cur = new Resource();
            resource = cur;
        }
        return cur;
    }
}
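A small usage sketch of the single-read version, hammering it from a few threads and counting null results. The names `ImmutableResource`, `SingleReadLazyInit` and `countNulls` are made up for this example. A run like this cannot prove anything about the memory model. It only shows the intended usage: since the method returns the local `cur`, which is either the non-null value it read or the object it just created, a null result is impossible by construction.

```java
// Hypothetical immutable resource: one final field, no escaping `this`.
final class ImmutableResource {
    private final String name;

    ImmutableResource(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

class SingleReadLazyInit {
    private static ImmutableResource resource;

    static ImmutableResource getInstance() {
        ImmutableResource cur = resource;  // the only read of the shared field
        if (cur == null) {
            cur = new ImmutableResource("default");
            resource = cur;                // racy publication, may be overwritten
        }
        return cur;                        // returns the local, never re-reads
    }

    /** Calls getInstance() from several threads and counts null results. */
    static int countNulls(int threads) {
        final ImmutableResource[] seen = new ImmutableResource[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> seen[slot] = getInstance());
            workers[i].start();
        }
        int nulls = 0;
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();         // join makes seen[i] visible here
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            if (seen[i] == null) nulls++;
        }
        return nulls;
    }
}
```

Different threads may still get different instances, which is the price of skipping synchronization.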
jtahlborn
  • @assylias - what does "happens before" have to do with anything? in no part of the code does instance get explicitly set to null by _any_ thread. how can re-ordering of _any_ thread make that or any other thread suddenly see a null value? – jtahlborn Jan 31 '13 at 16:33
  • @assylias - null is the initial value which is set in a _thread-safe_ manner by class initialization. that initial setting is guaranteed to be correctly visible to all threads (i.e. for all intents and purposes that is set by a single thread before _any other thread_ sees that Class). again, that doesn't affect this scenario at all. – jtahlborn Jan 31 '13 at 16:37
  • removed the comments since it was already discussed in the other thread. – assylias Jan 31 '13 at 17:00
  • I think `cur = resource = new Resource();` should be `resource = cur = new Resource();` to err on the safe side. Non synchronized concurrent stuff is a real headache... – assylias Jan 31 '13 at 17:02
  • 1
    @assylias - i hate multi-assign anyway, made it explicit. – jtahlborn Jan 31 '13 at 17:04
  • @assylias fortunately there's no err here - `a=b=c` does not contain a read of `b`; the value of `c` is directly assigned to `a`, not through `b`. – irreputable Feb 06 '13 at 05:09
  • 1
    @irreputable According to the jls there is a read of b: http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.26 – assylias Feb 06 '13 at 08:29
  • 1
    @assylias - yes the spec isn't very clear, but see http://cs.oswego.edu/pipermail/concurrency-interest/2013-January/010602.html and http://stackoverflow.com/questions/12850676/return-value-of-assignment-operator – irreputable Feb 06 '13 at 17:04
  • @assylias - note how Doug Lea quotes the same section for the opposite conclusion. I guess the value of (b=c) is stored in a temp local place, much like the value of a method invocation, then that value is assigned to a, therefore avoiding a read at b. – irreputable Feb 06 '13 at 17:09
  • @irreputable this is confusing. *"a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a."* seemed fairly clear to me... Thanks for the link. – assylias Feb 06 '13 at 17:40
0

This is a very old thread by now, but since the question touches on many interesting aspects of reordering and concurrency, I am chiming in, even if late.

For a moment, let's set concurrency aside, along with the actions and valid reorderings of the multi-threaded situation, and ask:
"Can the JVM use a cached value after a write operation in a single-threaded context?" I think not. Given that there is a write operation inside the if branch, can caching come into play at all?
So, back to the question: immutability ensures that the object is fully and correctly created before its reference is accessible or published, so immutability definitely helps. But here there is a write operation after the object creation. So can the second read use a value cached from before the write, in the same thread or another? No. One thread might not know about the write in another thread (given that there is no requirement for immediate visibility between threads). So wouldn't the possibility of returning a false null (i.e. after the object creation) be invalid? (The code in question breaks the singleton property, but we are not concerned with that here.)

samshers
-1

It is indeed safe if UnsafeLazyInitialization.resource is immutable, i.e. the field is declared as final:

private static final Resource resource = new Resource();

It might also be considered thread-safe if the Resource class itself is immutable and it does not matter which instance you are using. In that case two calls could return different instances of Resource without any issue, apart from increased memory consumption (depending on the number of threads calling getInstance() at the same time).

It seems far-fetched though, and I believe there is a typo; the real sentence should be

UnsafeLazyInitialization is actually safe if *r*esource is immutable.

Guillaume
  • I don't think there is a typo because in the first example you give there is no need to make `resource` final to ensure safe publication. – assylias Jan 31 '13 at 11:30
  • It is not strictly requested but I would not go without the final: it is a simple security against any later modification. – Guillaume Jan 31 '13 at 11:31
  • You may be surprised, but there are situations where `final` fields can be observed in their pre-initialization state and contain a `null` value. – Andremoniy Jan 31 '13 at 11:32
  • 1
    @Andremoniy It can indeed happen when you let `this` escape while constructing an object for example. – assylias Jan 31 '13 at 11:34
  • 1
    @assylias I know, and I'm talking actually about this – Andremoniy Jan 31 '13 at 11:34
-1

UnsafeLazyInitialization.getInstance() can never return null.

I'll use @assylias's table.

                              Some Thread
---------------------------------------------------------------------
 10: resource = null; //default value                                  //write
=====================================================================
           Thread 1               |          Thread 2                
----------------------------------+----------------------------------
 11: a = resource;                | 21: x = resource;                  //read
 12: if (a == null)               | 22: if (x == null)               
 13:   resource = new Resource(); | 23:   resource = new Resource();   //write
 14: b = resource;                | 24: y = resource;                  //read
 15: return b;                    | 25: return y;    

I'll use the line numbers for Thread 1. Thread 1 sees the write on 10 before the read on 11, and the read on line 11 before the read on 14. These are intra-thread happens-before relationships and don't say anything about Thread 2. The read on line 14 returns a value defined by the JMM. Depending on the timing, it may be the Resource created on line 13, or it may be any value written by Thread 2. But that write has to happen-after the read on line 11. There is only one such write, the unsafe publish on line 23. The write to null on line 10 is not in scope because it happened before line 11 due to intra-thread ordering.

It doesn't matter if Resource is immutable or not. Most of the discussion so far has focused on inter-thread action where immutability would be relevant, but the reordering that would allow this method to return null is forbidden by intra-thread rules. The relevant section of the spec is JLS 17.4.7.

For each thread t, the actions performed by t in A are the same as would be generated by that thread in program-order in isolation, with each write w writing the value V(w), given that each read r sees the value V(W(r)). Values seen by each read are determined by the memory model. The program order given must reflect the program order in which the actions would be performed according to the intra-thread semantics of P.

This basically means that while reads and writes may be reordered, reads and writes to the same variable must appear to happen in program order to the thread that executes them.

There's only a single write of null (on line 10). Either Thread can see its own copy of resource or the other Thread's, but it cannot see the earlier write to null after it reads either Resource.

As a side note, the initialization to null takes place in a separate thread. The section on Safe Publication in JCIP states:

Static initializers are executed by the JVM at class initialization time; because of internal synchronization in the JVM, this mechanism is guaranteed to safely publish any objects initialized in this way [JLS 12.4.2].

It may be worth trying to write a test that gets UnsafeLazyInitialization.getInstance() to return null, and that gets some of the proposed equivalent rewrites to return null. You'll see that they're not truly equivalent.
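A rough sketch of such a test follows. The names `LazyInitStressSketch`, `countNullResults` and `reset()` are hypothetical; `reset()` is added only so the race can be repeated, and because it runs before the worker threads are started, its write of null plays the same role as the default write on line 10. A count of zero on one JVM is an observation about that JVM, not a proof about what the specification allows.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyInitStressSketch {
    static final class Resource { }

    // Same shape as the book's listing, plus a hypothetical reset() for repeated trials.
    static final class UnsafeLazyInitialization {
        private static Resource resource;

        static Resource getInstance() {
            if (resource == null)
                resource = new Resource();  // unsafe publication
            return resource;
        }

        static void reset() {
            resource = null;
        }
    }

    // Runs `trials` rounds with `threads` racing callers each and
    // returns how many getInstance() calls returned null.
    static int countNullResults(int trials, int threads) {
        AtomicInteger nulls = new AtomicInteger();
        for (int t = 0; t < trials; t++) {
            UnsafeLazyInitialization.reset();
            CountDownLatch start = new CountDownLatch(1);
            Thread[] workers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> {
                    try {
                        start.await();          // release all callers at once
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    if (UnsafeLazyInitialization.getInstance() == null)
                        nulls.incrementAndGet();
                });
                workers[i].start();
            }
            start.countDown();
            for (Thread w : workers) {
                try {
                    w.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
        }
        return nulls.get();
    }
}
```

Calling `LazyInitStressSketch.countNullResults(10_000, 4)` gives a count you can compare against the same harness wrapped around one of the proposed rewrites.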

EDIT

Here's an example that separates reads and writes for clarity. Let's say there's a public static variable object.

public static Object object = new Integer(0);

Thread 1 writes to that object:

object = new Integer(1);
object = new Integer(2);
object = new Integer(3);

Thread 2 reads that object:

System.out.println(object);
System.out.println(object);
System.out.println(object);

Without any form of synchronization providing inter-thread happens-before relationships, Thread 2 can print out lots of different things.

1, 2, 3
0, 0, 0
3, 3, 3
1, 1, 3
etc.

But it cannot print out a decreasing sequence like 3, 2, 1. The intra-thread semantics specified in 17.4.7 severely limit reordering here. If instead of using object three times we changed the example to use three separate static variables, many more outputs would be possible because there would be no restrictions on reordering.
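For anyone who wants to watch this on their own machine, the writer and reader above can be wired into a small harness. This is only a sketch: `ReadOrderSketch` and `observeOnce` are made-up names, `Integer.valueOf` stands in for the deprecated `new Integer`, and the reader stores what it saw instead of printing it. It records what one run observed; it does not by itself establish what the memory model permits.

```java
final class ReadOrderSketch {
    // Plain, non-volatile static field, as in the example above.
    static Object object = Integer.valueOf(0);

    // Starts a writer and a reader with no synchronization between them
    // and returns the three values the reader observed, in read order.
    static int[] observeOnce() {
        object = Integer.valueOf(0);
        int[] seen = new int[3];
        Thread writer = new Thread(() -> {
            object = Integer.valueOf(1);
            object = Integer.valueOf(2);
            object = Integer.valueOf(3);
        });
        Thread reader = new Thread(() -> {
            seen[0] = (Integer) object;
            seen[1] = (Integer) object;
            seen[2] = (Integer) object;
        });
        writer.start();
        reader.start();
        try {
            // join() makes the reader's stores to `seen` visible to the caller
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Running `observeOnce()` in a loop and tallying the distinct triples gives a rough picture of which outcomes a particular JVM and CPU actually produce.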

Craig P. Motlin
  • Not questioning the message of your answer, I'd say that testing is irrelevant now, because that's environment and JVM dependent. – gaborsch Feb 05 '13 at 17:48
  • A few comments: (i) the first part of your quote of 17.4.7 would apply if the thread that can (supposedly) return null did write something - if it only reads a non null then a null value (and therefore performs no write in between) only the second part applies "Values seen by each read are determined by the JMM". (ii) null (the default value) is safely published but not the subsequent write which creates a visibility issue. (iii) You can't prove that it can't return null by testing - if the JVM you use does not perform the "guilty reordering", it will never happen... – assylias Feb 05 '13 at 18:09
  • The key part is `Values seen by each read are determined by the memory model.` Assuming the write occurred in another thread, the first read of `resource` could be non-null, while the second read is null. The explanation is simple, there is no happens-before relationship between the write in one thread and the read in another, hence anything could be read. (I should note it is simple in hindsight of previous answers and comment discussions) – Alex DiCarlo Feb 05 '13 at 18:09
  • @GaborSch Behavior specified by the JMM will not vary between environments and hardware. Testing isn't strictly necessary, I think the JMM documentation is sufficient. I just recommended testing to get comfortable with my answer since it's contrarian. – Craig P. Motlin Feb 05 '13 at 18:40
  • I'm confused about your argument. The default null write happens before the object is published, so all threads can see that null value. Can you elaborate why you believe null could never, in any circumstance allowed by the JMM and JLS, be a valid outcome in a multithreaded environment? – John Vint Feb 05 '13 at 18:56
  • @CraigP.Motlin The main question is: is it valid (concerning all specs, JMM, JLS) that a VM would reorder the 11 and 14 reads so, that 14 will happen BEFORE 11? Actually, because there is a *possible* write of 13, it **would not be a valid execution order**. A VM may optimize the execution order so, that excludes the not-executed branches (remaining just 2 reads, no writes), but to make this decision, **it must do the first read (11), and it must read not-null**, so **the 14 read cannot precede the 11 read**. So, I say it is NOT possible to return `null`. – gaborsch Feb 05 '13 at 19:52
  • 1
    @GaborSch That is an interesting argument. My answer proves that returning null is happens-before consistent - however, I have not proven that it satisfies the causality requirements of the JMM (and I would not know where to start) - your point might be the start of an answer. – assylias Feb 05 '13 at 19:57
  • @GaborSch I'm pretty sure we're saying the same thing in different ways. "happens-before" is a pretty interesting term that doesn't prevent reordering of instructions as long as effects are observed in a specific order. Even without the write, the reads on 14 and 11 can only see decreasingly stale values. – Craig P. Motlin Feb 05 '13 at 20:07
  • @CraigP.Motlin Yes, your statement that `null` is only written once, at creation time, is also a compulsory element of the reasoning. But I think that without the write on 13, a VM could easily swap the reads - why not, since *in a single thread* there is no difference (if you run multiple threads, taking care of that is your job). – gaborsch Feb 05 '13 at 20:21
  • @assylias I think we're reading 17.4.7 differently, and that it's an important section. As an aside, another way I look at the same thing is that the JMM is a simplified view of a real memory architecture in the same way that the JVM is a simplified view of a real CPU. Happens-before relationships just correspond with cached copies of memory, cache flushes, and cache invalidations. In this example, null may be cached in various locations, but nothing ever happens after line 10 which can flush it to other cache copies. – Craig P. Motlin Feb 05 '13 at 20:26
  • `Happens-before relationships just correspond with cached copies of memory, cache flushes, and cache invalidations.` Also corresponds to statement reordering which is why this problem arises. – John Vint Feb 05 '13 at 20:27
  • @JohnVint That's why my whole answer focuses on intra-thread happens-before instead of inter-thread happens-before. Realistic implementations of a JVM will have a Thread use a single cached copy of a memory location and that Thread observes its own reads and writes to that copy in order. It's almost too simple to write down, but it's there in 17.4.7. – Craig P. Motlin Feb 05 '13 at 20:29
  • But you still say it's impossible to return null. If you applied compiler re-ordering, which can be controlled by HB, then you would realize it can return null. – John Vint Feb 05 '13 at 20:33
  • @CraigP.Motlin Your example is not quite right, it has side effects (`System.out.println` is a non-atomic operation :) ). If you just assign to `o1`, `o2`, `o3`, that's better. But in that case, nothing would prevent the compiler from reordering the reads. – gaborsch Feb 05 '13 at 21:02
  • @Craig P. Motlin Two things, 1) if the object wasn't static technically you can see the value null. 2) This doesn't account for the ordering example I gave as that maintains sequential consistency – John Vint Feb 05 '13 at 21:07
  • @JohnVint The example you gave has different semantics and is not an allowed reordering. I don't understand what it's supposed to illustrate. It does read, read, write instead of read, write, read. 17.4.7 specifically forbids that sort of reordering. – Craig P. Motlin Feb 05 '13 at 21:20
  • But it doesn't; you are allowed to order reads ahead of writes, so long as that write doesn't affect the later read (which in my case it doesn't). assylias asked this same question on the concurrency mailing list, which agrees with my position http://cs.oswego.edu/pipermail/concurrency-interest/2013-February/date.html. In what way does my implementation change the expected outcome from the original? – John Vint Feb 05 '13 at 21:23
  • If the code was to read a, write b, read c, of course you could reorder those. But this code reads and writes the same variable. In your reordering (read, read, write), the second read cannot see the third write anymore. It's the essence of this question. Your rewrite can return null, the original post cannot. – Craig P. Motlin Feb 05 '13 at 21:27
  • The specific language in the JLS is just: each write w writing the value V(w), given that each read r sees the value V(W(r)). It's just saying that reads and writes of the same variable are not reordered in the same Thread. – Craig P. Motlin Feb 05 '13 at 21:29
  • The original can because the compiler can make it look like mine, or the example listed in the mailing list. – John Vint Feb 05 '13 at 21:30
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/24046/discussion-between-craig-p-motlin-and-john-vint) – Craig P. Motlin Feb 06 '13 at 16:21