26

Suppose I have a static complex object that gets periodically updated by a pool of threads, and read more or less continually in a long-running thread. The object itself is always immutable and reflects the most recent state of something.

class Foo() { int a, b; }
static Foo theFoo;
void updateFoo(int newA, int newB) {
  f = new Foo();
  f.a = newA;
  f.b = newB;
  // HERE
  theFoo = f;
}
void readFoo() {
  Foo f = theFoo;
  // use f...
}

I do not care in the least whether my reader sees the old or the new Foo, however I need to see a fully initialized object. IIUC, The Java spec says that without a memory barrier in HERE, I may see an object with f.b initialized but f.a not yet committed to memory. My program is a real-world program that will sooner or later commit stuff to memory, so I don't need to actually commit the new value of theFoo to memory right away (though it wouldn't hurt).

What do you think is the most readable way to implement the memory barrier ? I am willing to pay a little performance price for the sake of readability if need be. I think I can just synchronize the assignment to Foo and that would work, but I'm not sure it's very obvious to someone reading the code why I do that. I could also synchronize the whole initialization of the new Foo, but that would introduce more locking that actually needed.

How would you write it so that it's as readable as possible ?
Bonus kudos for a Scala version :)

Jean
  • 10,545
  • 2
  • 31
  • 31
  • From memory, I don't think you're reading the Java spec correctly. Could you point to the section where you think it says that. – Burleigh Bear Oct 18 '10 at 23:53
  • http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html and specifically "17.11: Out of order writes" illustrates this (the rules themselves are on the same page at 17.2 and 17.3). I may be reading wrong though. – Jean Oct 19 '10 at 00:19
  • @Burleigh Bear: I believe @Jean is correct in his assessment here... see my answer and push back if I've gotten it wrong. Reading the JMM specifics definitely reminds one of sausage being made! – andersoj Oct 19 '10 at 01:33
  • @andersoj: I agree with your (and Jean's) assessment, and have voted for your answer. – Burleigh Bear Oct 19 '10 at 06:28
  • @Jean, good edits to your question. I have attempted to address readability, but like anything else concurrency related, *documentation* in your code is the best way to make it readable. You could also use the JCiP concurrency annotations to more programmatically document things (i.e., @Immutable, @ProtectedBy, etc...): http://www.javaconcurrencyinpractice.com/annotations/doc/index.html – andersoj Oct 19 '10 at 15:15
  • I think so. `final` and `volatile` sound like good and clear technical solutions; the rest of the readability will come from good documentation. Thanks for pointing out JCiP concurrency annotations: they will indeed improve readability. – Jean Oct 19 '10 at 15:26

2 Answers2

45

Short Answers to the Original Question

  • If Foo is immutable, simply making the fields final will ensure complete initialization and consistent visibility of fields to all threads irrespective of synchronization.
  • Whether or not Foo is immutable, publication via volatile theFoo or AtomicReference<Foo> theFoo is sufficient to ensure that writes to its fields are visible to any thread reading via theFoo reference
  • Using a plain assignment to theFoo, reader threads are never guaranteed to see any update
  • In my opinion, and based on JCiP, the "most readable way to implement the memory barrier" is AtomicReference<Foo>, with explicit synchronization coming in second, and use of volatile coming in third
  • Sadly, I have nothing to offer in Scala

You can use volatile

I blame you. Now I'm hooked, I've broken out JCiP, and now I'm wondering if any code I've ever written is correct. The code snippet above is, in fact, potentially inconsistent. (Edit: see the section below on Safe publication via volatile.) The reading thread could also see stale (in this case, whatever the default values for a and b were) for unbounded time. You can do one of the following to introduce a happens-before edge:

  • Publish via volatile, which creates a happens-before edge equivalent to a monitorenter (read side) or monitorexit (write side)
  • Use final fields and initialize the values in a constructor before publication
  • Introduce a synchronized block when writing the new values to theFoo object
  • Use AtomicInteger fields

These gets the write ordering solved (and solves their visibility issues). Then you need to address visibility of the new theFoo reference. Here, volatile is appropriate -- JCiP says in section 3.1.4 "Volatile variables", (and here, the variable is theFoo):

You can use volatile variables only when all the following criteria are met:
  • Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
  • The variable does not participate in invariants with other state variables; and
  • Locking is not required for any other reason while the variable is being accessed

If you do the following, you're golden:

class Foo { 
  // it turns out these fields may not be final, with the volatile publish, 
  // the values will be seen under the new JMM
  final int a, b; 
  Foo(final int a; final int b) 
  { this.a = a; this.b=b; }
}

// without volatile here, separate threads A' calling readFoo()
// may never see the new theFoo value, written by thread A 
static volatile Foo theFoo;
void updateFoo(int newA, int newB) {
  f = new Foo(newA,newB);
  theFoo = f;
}
void readFoo() {
  final Foo f = theFoo;
  // use f...
}

Straightforward and Readable

Several folks on this and other threads (thanks @John V) note that the authorities on these issues emphasize the importance of documentation of synchronization behavior and assumptions. JCiP talks in detail about this, provides a set of annotations that can be used for documentation and static checking, and you can also look at the JMM Cookbook for indicators about specific behaviors that would require documentation and links to the appropriate references. Doug Lea has also prepared a list of issues to consider when documenting concurrency behavior. Documentation is appropriate particularly because of the concern, skepticism, and confusion surrounding concurrency issues (on SO: "Has java concurrency cynicism gone too far?"). Also, tools like FindBugs are now providing static checking rules to notice violations of JCiP annotation semantics, like "Inconsistent Synchronization: IS_FIELD-NOT_GUARDED".

Until you think you have a reason to do otherwise, it's probably best to proceed with the most readable solution, something like this (thanks, @Burleigh Bear), using the @Immutable and @GuardedBy annotations.

@Immutable
class Foo { 
  final int a, b; 
  Foo(final int a; final int b) { this.a = a; this.b=b; }
}

static final Object FooSync theFooSync = new Object();

@GuardedBy("theFooSync");
static Foo theFoo;

void updateFoo(final int newA, final int newB) {
  f = new Foo(newA,newB);
  synchronized (theFooSync) {theFoo = f;}
}
void readFoo() {
  final Foo f;
  synchronized(theFooSync){f = theFoo;}
  // use f...
}

or, possibly, since it's cleaner:

static AtomicReference<Foo> theFoo;

void updateFoo(final int newA, final int newB) {
  theFoo.set(new Foo(newA,newB)); }
void readFoo() { Foo f = theFoo.get(); ... }

When is it appropriate to use volatile

First, note that this question pertains to the question here, but has been addressed many, many times on SO:

In fact, a google search: "site:stackoverflow.com +java +volatile +keyword" returns 355 distinct results. Use of volatile is, at best, a volatile decision. When is it appropriate? The JCiP gives some abstract guidance (cited above). I'll collect some more practical guidelines here:

  • I like this answer: "volatile can be used to safely publish immutable objects", which neatly encapsulates most of the range of use one might expect from an application programmer.
  • @mdma's answer here: "volatile is most useful in lock-free algorithms" summarizes another class of uses—special purpose, lock-free algorithms which are sufficiently performance sensitive to merit careful analysis and validation by an expert.

  • Safe Publication via volatile

    Following up on @Jed Wesley-Smith, it appears that volatile now provides stronger guarantees (since JSR-133), and the earlier assertion "You can use volatile provided the object published is immutable" is sufficient but perhaps not necessary.

    Looking at the JMM FAQ, the two entries How do final fields work under the new JMM? and What does volatile do? aren't really dealt with together, but I think the second gives us what we need:

    The difference is that it is now no longer so easy to reorder normal field accesses around them. Writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire. In effect, because the new memory model places stricter constraints on reordering of volatile field accesses with other field accesses, volatile or not, anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.

    I'll note that, despite several rereadings of JCiP, the relevant text there didn't leap out to me until Jed pointed it out. It's on p. 38, section 3.1.4, and it says more or less the same thing as this preceding quote -- the published object need only be effectively immutable, no final fields required, QED.

    Older stuff, kept for accountability

    One comment: Any reason why newA and newB can't be arguments to the constructor? Then you can rely on publication rules for constructors...

    Also, using an AtomicReference likely clears up any uncertainty (and may buy you other benefits depending on what you need to get done in the rest of the class...) Also, someone smarter than me can tell you if volatile would solve this, but it always seems cryptic to me...

    In further review, I believe that the comment from @Burleigh Bear above is correct --- (EDIT: see below) you actually don't have to worry about out-of-sequence ordering here, since you are publishing a new object to theFoo. While another thread could conceivably see inconsistent values for newA and newB as described in JLS 17.11, that can't happen here because they will be committed to memory before the other thread gets ahold of a reference to the new f = new Foo() instance you've created... this is safe one-time publication. On the other hand, if you wrote

    void updateFoo(int newA, int newB) {
      f = new Foo(); theFoo = f;     
      f.a = newA; f.b = newB;
    }
    

    But in that case the synchronization issues are fairly transparent, and ordering is the least of your worries. For some useful guidance on volatile, take a look at this developerWorks article.

    However, you may have an issue where separate reader threads can see the old value for theFoo for unbounded amounts of time. In practice, this seldom happens. However, the JVM may be allowed to cache away the value of the theFoo reference in another thread's context. I'm quite sure marking theFoo as volatile will address this, as will any kind of synchronizer or AtomicReference.

    Community
    • 1
    • 1
    andersoj
    • 22,406
    • 7
    • 62
    • 73
    • You are right, it can and it is actually done in a constructor in my actual code. I was not aware there were any specific publication rules for constructors, so I will search in that direction. Thanks for the tip :) – Jean Oct 19 '10 at 00:30
    • So I may be wrong about special treatment for constructors. I believe this special treatment *only* applies to `final` fields (which I think yours would be in this case, though you haven't made that explicit). See http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf, all the references to constructors are in the context of initializing `final` fields. – andersoj Oct 19 '10 at 01:02
    • 1
      Whoa. Thanks a lot for the lot of research and the detailed and easy to understand explanation. Now I know how I should write my code. Thank you very much. – Jean Oct 19 '10 at 01:39
    • 1
      Just for completeness, I'll add that if you're trying to maximise readability, I'd forget all of this, and just synchronize the two methods. – Burleigh Bear Oct 19 '10 at 07:36
    • I dont get it. Why dont you just declare theFoo volatile. You can then get rid of synchronized. You will also save time because a volatile write is about 1/3 the cost of a monitor enter/exit. I see it in your answer but I would argue it would be more readable. – John Vint Oct 19 '10 at 13:56
    • @John V: My answer includes the "just use volatile" answer, but I think many would disagree that it is more readable. In my experience, `volatile` is a much-misunderstood corner of Java, requires careful analysis to ensure it's correct, and may easily get lost as the class is evolved over time. I tried to give the answer to the question, but also note that `synchronized` will be understood by more folks, is durable over a larger range of uses, and is *probably* preferred unless there is a well-understood performance driver. – andersoj Oct 19 '10 at 14:39
    • @John V: JCiP discourages use of volatile except when they "simplify implement[ation] and verif[ication]", and points the implementer towards Atomic variables, which can "often be used as better volatile variables." (JCiP 3.1.1) While I don't normally appeal to authority, JCiP has a sterling reputation, and I tend to defer unless I have concrete evidence to the contrary. – andersoj Oct 19 '10 at 14:41
    • Also note that `volatile` only works reliably in the specific case of immutable objects. While I was pushing the questioner in this direction, I'm not sure we can make that assumption... if the `theFoo` is actually mutable or becomes so in the future, `volatile` will not guarantee ordering, consistency, or visibility of future writes to its fields. – andersoj Oct 19 '10 at 14:44
    • @andersoj I agree that it would be better to use an AtomicReference. However the volatile keyword since 1.5 ensures a happens-before ordering. It will guarantee ordering, consistency and visibility just as much as an AtomicReference would. I was arguing that in your example, keeping all other code the same, declaring the field volatile would in my opinion be a better implementation. – John Vint Oct 19 '10 at 16:58
    • @andersoj For readability you can make that argument. It is more obvious that the variable theFoo is guarded against whatever object its synchronizing on. But I guess if you dont have acceptable documentation on the field itself then it too can be lost in the future. One of the biggest points in JCiP is to document thread safety. – John Vint Oct 19 '10 at 17:01
    • Accidentally deleted this comment - @andersoj Last point, I missed one of the last things your wrote. Today, most processors offer volatile reads to be close to a normal read. If we are going for pure throughput, and there are a lot of reads, the difference is pretty substantial. – John Vint Oct 19 '10 at 19:03
    • @John V: That's good information... I wonder if there's a source that quantifies that somehow... (off to google scholar...) – andersoj Oct 19 '10 at 19:05
    • @andersoj your argument that volatile only works correctly with immutable objects is not correct (it was under the old JMM but under the latest JMM it isn't - see 17.4.7) – all writes that are before the volatile-write happen-before the volatile read. As I said elsewhere final is pretty much always preferable anyway though for its additional guarantees. – Jed Wesley-Smith Oct 20 '10 at 01:15
    • @Jed Wesley-Smith agreed, I think we crossed paths as I was responding... can you read the new text and see if I've got it right now? – andersoj Oct 20 '10 at 01:16
    4

    Having an immutable Foo with final a and b fields solves the visibility issues with the default values, but so does making theFoo volatile.

    Personally I like having immutable value classes anyway as they much harder to misuse.

    Jed Wesley-Smith
    • 4,686
    • 18
    • 20
    • @Jed Wesley-Smith: See my answer--I believe in this case you have to do both to get the desired guarantees. – andersoj Oct 19 '10 at 01:24
    • @andersoj a write to a volatile guarantees a happens-before relationship (the Foo is safely published). This means that everything that happened-before the write to theFoo will be seen if the new foo value is available. As I said, the immutable Foo class is preferable as it guarantees nobody will be able to change the state or accidentally re-order the code and change the volatile write semantics, but volatile is enough on its own. – Jed Wesley-Smith Oct 20 '10 at 00:37
    • @andersj also, he specifically states that the requirement is for a properly constructed Foo, not necessarily the latest one, so volatile is not necessary for the latest object guarantee (visibility of any particular write), therefore an immutable Foo with a static initializer to create a non-null initial theFoo will guarantee a non-null theFoo at all times. – Jed Wesley-Smith Oct 20 '10 at 00:45
    • @Jed Wesley-Smith: So I think you've convinced me, though I have to confess it's really hard to dig this out of the available documentation... I'm writing up another change to my answer to capture what I think I've learned... – andersoj Oct 20 '10 at 00:48
    • @Jed yes re: the "volatile is not necessary" except that without it, the latency to see updates from other threads is completely unbounded. While the questioner doesn't care exactly which update is seen, he does care that updates are eventually seen. My understanding is that, without volatile or any other edge, the reader thread is free to cache the value effectively forever, yes? – andersoj Oct 20 '10 at 01:20
    • You might want to comment here: http://stackoverflow.com/questions/3488703/when-exactly-do-you-use-the-volatile-keyword-in-java/3488824#3488824 – andersoj Oct 20 '10 at 01:21
    • Is immediate commit to memory necessary ? Yes, and no. @andersoj, you are perfectly right in stating that in absence of any other edge, the other threads are never guaranteed to see any change. On the other hand, as I specified in the original question, all my threads do encounter memory barriers fairly commonly, and they will shortly commit to memory *anyway*, always fast enough for my needs. So for my functionality, it is not *needed*, though the code in the question doesn't make it obvious. However... it's more readable and less error-prone to make it volatile anyway. So I'm doing it. – Jean Oct 20 '10 at 04:59
    • And then again, while I'm not very fond of the syntax of AtomicReference, it has two very definite advantages over a simply volatile field : 1. it makes it unmistakably obvious this is a lock-free shared object. `AtomicReference` is pretty explicit in that I'm building, well, an atomic reference. 2. It forces client code to use get() and anyone sane will cache the value in a local variable. I can imagine client code accessing theFoo.a and theFoo.b separately, and getting values that are both correct, but don't agree with each other. I don't see theFoo.get().a and theFoo.get().b used as much. – Jean Oct 20 '10 at 05:05
    • @Jean: Note difference between your thread committing to memory and other threads flushing their cache and reading from memory. Both have to occur -- and in the case of the read-side, it is dangerous to rely on this occurring unless those threads have some other interaction with this reference/object... your situation is very close to the canonical "how do I stop a thread?" question, e.g. here http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile – andersoj Oct 20 '10 at 12:27
    • Ah, yes you're right, I had not thought of that. Happens, my reader threads also have memory barriers. But it's nice to point it out. Anyway, no problem with the `AtomicReference` or with the volatile solution. – Jean Oct 20 '10 at 16:52