8

So, following some job interviews, I wanted to write a small program to check that i++ really is non-atomic in Java, and that one should, in practice, add some locking to protect it. Turns out you should, but that is not the question here.

So I wrote this program here just to check it.

The thing is, it hangs. It seems that the main thread is stuck on the t1.join() line, even though both worker threads should finish because of the stop = true on the previous line.

I found that the hanging stops if:

  • I add some printing inside the worker threads (as in the comments), probably causing the worker threads to sometimes give up the CPU, or
  • I mark the boolean flag stop as volatile, causing the write to be seen immediately by the worker threads, or
  • I mark the counter t as volatile... for this I have no idea what causes the un-hanging.

Can someone explain what's going on? Why do I see the hang, and why does it stop in those three cases?

public class Test {   

    static /* volatile */ long t = 0;
    static long[] counters = new long[2]; 
    static /* volatile */ boolean stop = false;

    static Object o = new Object();
    public static void main(String[] args) throws InterruptedException
    {
        Thread t1 = createThread(0);
        Thread t2 = createThread(1);

        t1.start();
        t2.start();

        Thread.sleep(1000);

        stop = true;

        t1.join();
        t2.join();

        System.out.println("counter : " + t + " counters : " + counters[0] + ", " + counters[1]  + " (sum  : " + (counters[0] + counters[1]) + ")");

    }

    private static Thread createThread(final int i)
    {
        Thread thread = new Thread() { 
            public void run() {
                while (!stop)
                {
//                  synchronized (o) {                      
                        t++;
//                  }

//                  if (counters[i] % 1000000 == 0)
//                  {
//                      System.out.println(i + ")" + counters[i]); 
//                  }
                    counters[i]++;
                }
            }; 
        };
        return thread;
    }
}
Yossi Vainshtein

4 Answers

8

It seems that the main thread is stuck on the t1.join() line, even though both worker threads should finish because of the stop = true on the previous line.

In the absence of volatile, locking, or other safe publication mechanism, the JVM has no obligation to ever make stop = true visible to other threads. Specifically applied to your case, while your main thread sleeps for one second, the JIT compiler optimizes your while (!stop) hot loop into the equivalent of

if (!stop) {
    while (true) {
        ...
    }
}

This particular optimization is known as "hoisting" of the read action out of the loop.

I found that the hanging stops if:

  • I add some printing inside the worker threads (as in the comments), probably causing the worker threads to sometimes give up the CPU

No, it's because PrintStream::println acquires a lock internally. All known JVMs will emit a memory fence at the CPU level to ensure the semantics of an "acquire" action (in this case, lock acquisition), and this will force a reload of the stop variable. This is not required by the specification; it is just an implementation choice.

  • If I mark the flag boolean stop as volatile, causing the write to immediately be seen by worker threads

The specification actually has no wall clock-time requirements on when a volatile write must become visible to other threads, but in practice it is understood that it must become visible "very soon". So this change is the correct way to ensure that the write to stop is safely published to, and subsequently observed by, other threads reading it.
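As a minimal sketch of that fix, the program from the question could be reworked as below. The class name VolatileStopDemo, the run helper, and the use of AtomicLong for the shared counter are assumptions added for illustration; only the volatile stop flag is needed to cure the hang, while AtomicLong additionally makes each increment atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

class VolatileStopDemo {

    // volatile: every loop iteration must perform a real read of the flag,
    // so the JIT compiler may not hoist the read out of the loop.
    static volatile boolean stop;

    // AtomicLong replaces the racy t++ (an addition, not part of the original program).
    static final AtomicLong total = new AtomicLong();

    // Each slot is written by exactly one worker thread.
    static final long[] counters = new long[2];

    static Thread createThread(final int i) {
        return new Thread(() -> {
            while (!stop) {
                total.incrementAndGet();
                counters[i]++;
            }
        });
    }

    // Hypothetical helper: runs two workers for roughly `millis` milliseconds
    // and returns {total, counters[0], counters[1]}.
    static long[] run(long millis) throws InterruptedException {
        stop = false;
        total.set(0);
        counters[0] = 0;
        counters[1] = 0;

        Thread t1 = createThread(0);
        Thread t2 = createThread(1);
        t1.start();
        t2.start();

        Thread.sleep(millis);
        stop = true;

        // join() establishes happens-before, so the workers' final counts are visible here.
        t1.join();
        t2.join();
        return new long[] { total.get(), counters[0], counters[1] };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] r = run(1000);
        System.out.println("counter : " + r[0] + " counters : " + r[1] + ", " + r[2]
                + " (sum : " + (r[1] + r[2]) + ")");
    }
}
```

With these two changes the program should terminate shortly after the sleep, and the shared counter should equal the sum of the per-thread counters, because no increment is lost.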

  • If I mark the counter t as volatile... for this I have no idea what causes the un-hanging.

These are again the indirect effects of what the JVM does to ensure the semantics of a volatile read, which is another kind of "acquire" inter-thread action.

In summary, except for the change making stop a volatile variable, your program switches from hanging forever to completing due to the accidental side-effects of the underlying JVM implementation, which for simplicity does some more flushing/invalidation of thread-local state than required by the specification.

Marko Topolnik
  • Great answer, Marko! I found your comment from below particularly interesting, could you add it in as part of the answer? "much more importantly, a volatile operation inside the loop prevents the optimization that hoists the read of stop outside the loop" – Andrew Williamson Apr 03 '17 at 19:27
  • It's already there, I added the term "hoisting" to connect it to the existing material. – Marko Topolnik Apr 03 '17 at 19:34
  • Oh my bad, missed that :D – Andrew Williamson Apr 03 '17 at 19:39
  • @Marko Thanks for the great answer. It's full, clear and very informative. Can you give some links for further reading? And what is a "memory fence"? Thanks again! – Yossi Vainshtein Apr 03 '17 at 20:29
  • @YossiVainshtein Hm, i guess the stereotype "read JCIP" works for me too. A memory fence is a concept of CPU architectures, you'd better google it out. There's not enough space here to give it justice. – Marko Topolnik Apr 03 '17 at 20:33
2

These might be the possible reasons:

If you are interested in digging deeper into the topic, I would suggest going through the book "Java Concurrency in Practice" by Brian Goetz.

Ruben
  • Just updated the answer. Marking "t" as volatile also makes threads reread "stop" from the memory since they might be stored in the same "cache line". I think adding @Contended to the definitions of those fields might prevent the second point from working since the JVM might arrange them in a way so they are not cached together anymore. – Ruben Apr 03 '17 at 08:28
  • I guess both points are right, I thought that even without the `volatile` keyword, the JVM would **eventually** sync the memory between threads, but it hangs for some long minutes... – Yossi Vainshtein Apr 03 '17 at 08:28
  • Without volatile the JVM might indeed synchronize the cache with memory, but there is no reliable pattern in that case, it might really depend on various factors. Locally it once did after a couple of seconds, but on the second run it did not for quite a long time. – Ruben Apr 03 '17 at 08:38
  • `this behavior might also reread "stop" from the memory`---much more importantly, a volatile operation inside the loop prevents the optimization that hoists the read of `stop` outside the loop. – Marko Topolnik Apr 03 '17 at 09:14
  • "Marking "t" as volatile triggers read from memory each time "t" is accessed" that's not true - not just in theory even in practice that's really not what volatile does (it will stay just fine in cache on x86 for example). People really have to stop thinking in terms of "reading from memory". – Voo Apr 03 '17 at 12:25
  • @Voo I agree that volatile does not trigger a roundtrip to the memory each time the variable is accessed but rather makes sure that only after a write the caches are either purged or synchronized. However referring to the effect of volatile as "reading fresh value from memory" is rather a simple way of modeling the end effect of it. Thx for the correction. – Ruben Apr 03 '17 at 14:04
  • `makes sure that only after a write the caches are either purged or synchronized`---this is also false. What it does depends on the underlying architecture as well as the specific code idiom being compiled. On x86 there is no flushing/purging of caches going on. In some very simple cases the compiler can even prove that no other thread is reading the volatile variable and treat it as a local var. The only reasonable model is the JMM itself, which is actually far, far simpler than the reality you're trying to refer to. – Marko Topolnik Apr 03 '17 at 14:09
  • @Ruben The problem with that explanation is that it is ill-defined, leads to wrong assumptions and does not even cover the most important parts of volatile (it is after all about reordering and visibility). For example you're saying that it means "reading fresh value from memory". But why does volatile then guarantee that I will see updates to non-volatile variables (under certain circumstances)? So that doesn't work. Or do you mean "make sure all values are read fresh from memory"? But why doesn't `synchronized(new Object())` have that effect then? And so on. – Voo Apr 03 '17 at 16:33
0

Marking a variable as volatile is a hint for the JVM to flush/sync the related segments of cache between threads/cores when that variable is updated. Marking stop as volatile then has better behaviour (but not perfect: you may have some extra executions on your threads before they see the update).

Why marking t as volatile works puzzles me; it may be that, because this is such a small program, t and stop are in the same cache line, so when one gets flushed/synced the other does too.

System.out.println is thread safe, so there is some synchronization going on internally. Again, this may be causing some parts of the cache to be synced between the threads.

If anyone can add to this, please do, I also would really like to hear a more detailed answer on this.

Andrew Williamson
0

It actually does what it is said to do: it provides consistent access to the field between multiple threads, and you can see it.

Without the volatile keyword, multithreaded access to the field is not guaranteed to be consistent; compilers can introduce optimisations such as caching it in a CPU register, or not writing it out from the core-local cache to external memory or a shared cache.


For the part with non-volatile stop and volatile t:

According to the JSR-133 Java Memory Model specification, all writes (to any other field) before a volatile field update are made visible; they happen-before it.

When you set the stop flag after incrementing t, it will not be visible to the subsequent read in the loop, but the next increment (a volatile write) will make it visible.


See also

Java Language Specification: 8.3.1.4. volatile Fields

An article about Java Memory Model, from the author of Java theory and practice

Pavlus
  • `when you set stop flag after incrementing t`---this actually never happens in OP's code. If it did happen, it would just mean that the thread observes its own write to the field. – Marko Topolnik Apr 03 '17 at 09:16
  • "after" in terms of everyday life and "when" in the meaning "if things go this way". Writes can happen in any order, but most noticeable case is a write to `stop` from other thread that happens after write to volatile field passed, but loop condition is not tested yet, in this case, loop will iterate one more time, then volatile write to counter flushes flag value, like it happened after condition check (it isn't in reality, but code observes it this way) – Pavlus Apr 03 '17 at 10:14
  • If you're talking about specified behavior, then what you describe isn't guaranteed. With `stop` being a non-volatile variable there is no happens-before between the main thread and the worker thread. – Marko Topolnik Apr 03 '17 at 10:50
  • I'm talking about "for this I have no idea what causes the un-hanging" part of the question, when `stop` is non-volatile and `t` is volatile, but `stop` changes are visible between threads, because they happens-before write to the volatile field `t`, as for JSR-133. – Pavlus Apr 03 '17 at 11:25
  • No, main thread's write to `stop` does not happen-before any action in any other thread. Main thread performs no interthread `release` actions. – Marko Topolnik Apr 03 '17 at 11:29
  • `all writes (to any other field) before volatile field update are made visible` --- this is incorrect. Only the writes occurring _in program order_ before the volatile field write are made visible. In other words, only the writes performed _by the same thread_ before it writes to the volatile field. Since the main thread doesn't update anything except the `stop` variable, there is actually no volatile write to speak of. – Marko Topolnik Apr 03 '17 at 11:48
  • @Marko Topolnik. Yes, I missed that. Thanks. – Hoopje Apr 03 '17 at 14:00
  • @Marko Topolnik. Yes, I wrote the comment too hastily. I'll delete it. – Hoopje Apr 04 '17 at 13:38
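
The program-order rule discussed in this comment thread can be illustrated with a small sketch. The class HappensBeforeDemo and its fields data and ready are hypothetical names, not taken from the question. The writer thread stores to a plain field and then to a volatile field; a reader that observes the volatile write is then guaranteed to see the earlier plain write from that same thread.

```java
class HappensBeforeDemo {

    static int data;                 // plain, non-volatile field
    static volatile boolean ready;   // volatile flag used to publish `data`

    // Hypothetical helper: returns the value of `data` that the reader thread observed.
    static int readAfterPublish() throws InterruptedException {
        data = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            // Spin on the volatile read; it cannot be hoisted out of the loop.
            while (!ready) {
                // busy-wait
            }
            // The volatile read above saw the write `ready = true`, so everything the
            // writer did before that write (in program order) is visible here.
            seen[0] = data;
        });
        reader.start();

        data = 42;      // plain write, before the volatile write in program order
        ready = true;   // volatile write: publishes the plain write above

        reader.join();  // join() makes seen[0] visible to this thread
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readAfterPublish()); // prints 42
    }
}
```

In the question's program the situation is reversed: the main thread writes only the plain field stop and performs no volatile write afterwards, so making t volatile creates no such ordering between the main thread and the workers.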