
In this example, is declaring the parameter obj final sufficient to use it safely in the thread below?

public void doSomethingAsync (final Object obj)
{
  Thread thread = new Thread ()
  {
    @Override public void run () { /* ... do something with obj ... */ }
  };

  thread.start ();
}

At first glance it may seem fine. A caller invokes doSomethingAsync and obj gets cached until needed in the thread.

But what happens if there is a burst of calls to doSomethingAsync such that they all complete before the threads have done anything with obj?

If the Java compiler simply makes obj into a member variable, the last call to doSomethingAsync will overwrite the prior values of obj, making prior invocations of the thread use a wrong value. Or, does the compiler generate a queue or some dimensioned storage for obj so that each thread gets the proper value?

Peri Hartman
  • Some of this will depend on exactly what type of object you're dealing with. The reference itself will certainly be thread safe, but modifications to obj can still race and conflict. – Louis Wasserman Mar 04 '17 at 03:00
  • Yes, I realize the content of obj can change. That's a different issue. But why is obj itself thread safe? Please answer and I'll credit you points. – Peri Hartman Mar 04 '17 at 03:03
  • I do not think the "final" attribute of doSomething(...) transfers to run(). They are different objects. The final is required so it is not modified in the method. – bichito Mar 04 '17 at 03:05
  • I think I might have missed the thrust of your question. In order to use variables inside the @Override run method, they must be final. _And_ declaring them final as you have done is sufficient to use them in that method. As I and others have pointed out, declaring a method parameter/variable final does not make that variable thread safe (of course). The compiler will not generate storage to hold a deep copy of parameters/variables for use by run() at execution time __unless__ the parameter/variable is a primitive. Instead, the compiler stores a COPY of the REFERENCE that you declared final. – Teto Mar 04 '17 at 04:15

3 Answers


At first glance it may seem fine. A caller invokes doSomethingAsync and obj gets cached until needed in the thread.

The object is not "cached"; the variable merely cannot be reassigned to refer to another object. The final keyword only prevents the variable from being reassigned. It does not prevent the referenced object from being mutated.

But what happens if there is a burst of calls to doSomethingAsync such that they all complete before the threads have done anything with obj?

If the threads modify the referenced object, the outcome is unpredictable: they compete for the object, and without synchronization a thread may see "old" values because changes made by one thread are not guaranteed to be visible to the others. If the object is immutable, meaning its state cannot change after construction, then it is inherently thread safe.

If the Java compiler simply makes obj into a member variable, the last call to doSomethingAsync will overwrite the prior values of obj, making prior invocations of the thread use a wrong value. Or, does the compiler generate a queue or some dimensioned storage for obj so that each thread gets the proper value?

The compiler does not guarantee that the threads execute in order; threads run concurrently. This is why the synchronized keyword exists: it lets you guarantee that when you access the object you see the same state of the object that all of the other threads see. This comes at a cost to performance, so it is recommended to pass only immutable objects into threads so that you don't have to synchronize every time you do something with the object.
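As a rough illustration of that point (my own sketch, not code from the question; SharedCounter and runDemo are made-up names), here several threads share one mutable object and every access goes through synchronized methods, so no increment is lost:

```java
class SharedCounter {
    private int count = 0;

    // synchronized: only one thread at a time may run these on the same instance,
    // and each thread sees the updates made by the previous lock holder.
    synchronized void increment() { count++; }
    synchronized int get() { return count; }

    // Starts `threads` workers that each increment one shared counter `perThread` times.
    static int runDemo(final int threads, final int perThread) {
        final SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread() {
                @Override public void run() {
                    for (int j = 0; j < perThread; j++) counter.increment();
                }
            };
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join(); // wait for every worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000)); // prints 40000
    }
}
```

If you drop synchronized from increment, the total can come out lower than expected because count++ is not atomic.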

Jake Holzinger
  • I can only put so many words in the title. In my description, I describe a different problem from what you are answering. If you have an answer, that would be great. Please reread. – Peri Hartman Mar 04 '17 at 03:19

Large edit here, based on a conversation the Original Poster and I had in chat.

It seems Peri's real question was about how Java stores local variables like "obj" for use by the Thread. These are called "captured variables" if you want to google it yourself. There is a nice discussion here.

Basically, what happens is that the local variables your local class refers to (the ones stored on the stack), plus the "this" pointer, get copied into your local class (the Thread subclass in this case) when the local class is instantiated.
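You can peek at this with reflection. The following is only a sketch under the assumption that the compiler stores the captured value in a hidden instance field of the anonymous class (javac typically names it val$obj, but that name is an implementation detail, so the code does not rely on it). CapturedFieldDemo, makeThread and holdsCopyOf are names I made up:

```java
import java.lang.reflect.Field;

class CapturedFieldDemo {
    // Builds the same kind of anonymous Thread as in the question (it is never started here).
    static Thread makeThread(final Object obj) {
        return new Thread() {
            @Override public void run() { System.out.println(obj); }
        };
    }

    // Returns true if some field declared by the thread's own (anonymous) class
    // holds exactly the given object.
    static boolean holdsCopyOf(Thread thread, Object expected) {
        try {
            for (Field f : thread.getClass().getDeclaredFields()) {
                f.setAccessible(true);
                if (f.get(thread) == expected) return true;
            }
            return false;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        Thread ta = makeThread(a);
        Thread tb = makeThread(b);
        // Each instance carries its own copy of the reference it was created with.
        System.out.println(holdsCopyOf(ta, a) + " " + holdsCopyOf(ta, b));
        System.out.println(holdsCopyOf(tb, b) + " " + holdsCopyOf(tb, a));
    }
}
```

The copy lives in the Thread instance, not in the enclosing object, which is why a later call cannot overwrite it.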

Original answer follows for the sake of the comments. But it is now obsolete.

Each time you call doSomethingAsync you are creating a new thread. If you call doSomethingAsync just once with a particular object, and then you modify that same object in the calling thread, then you have no idea what the asynchronous thread will do. It might "do something with the object" before you modify it in the calling thread, after you modify it in the calling thread, or even WHILE you are concurrently modifying it in the calling thread. Unless the Object itself is thread safe, this will cause problems.

Similarly, if you call doSomethingAsync twice with the same object, then you have no idea which asynchronous thread will modify the object first, and no guarantee they will not act concurrently on the same object.

Finally, if you call doSomethingAsync twice with 2 different objects then you don't know which asynchronous thread will act on its own object first, but you don't care, because they can't conflict with each other unless the objects have static mutable variables (class variables) that are being modified.

If you require that one task get completed before another task and in the order submitted, then a single threaded ExecutorService is your answer.
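A minimal sketch of that idea, using Executors.newSingleThreadExecutor from java.util.concurrent (OrderedTasks and runInOrder are names I made up for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedTasks {
    // Submits n tasks to a single worker thread and returns the order in which they ran.
    static List<Integer> runInOrder(int n) {
        final List<Integer> seen = Collections.synchronizedList(new ArrayList<Integer>());
        ExecutorService executor = Executors.newSingleThreadExecutor();
        for (int i = 0; i < n; i++) {
            final int id = i; // each task captures its own copy of id
            executor.submit(new Runnable() {
                @Override public void run() { seen.add(id); }
            });
        }
        executor.shutdown(); // accept no new tasks, but finish the queued ones
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<Integer>(seen);
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

With one worker thread the tasks run sequentially in submission order, so a task never overlaps with the one before it.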

Teto
  • You are answering a different question. The question is about which "obj" each thread will get. For simplicity, you could consider that "obj" is an int, rather than an Object. – Peri Hartman Mar 04 '17 at 03:21
  • Objects and ints are entirely different things! Okay, if they are ints, the ints are thread safe. Each new thread will have its own unique int. It will have a _copy_ of the int you pass to it when you made the call. It may not get acted upon until later, but its copy is thread safe. And it doesn't need to be declared "final" to make it so. – Teto Mar 04 '17 at 03:25
  • Do you have a reference for that information? I did some searching but was unable to find documentation for this particular case. Thanks! – Peri Hartman Mar 04 '17 at 03:29
  • Following up on my previous comment. An instance of Object (or a subclass thereof) is passed as a reference, meaning the actual object is not copied into a unique place for use by the method. Instead, the method receives a copy of a pointer to the Object. An int is passed to a method differently: the method receives a private copy of the int, not a pointer to it. Finally, when you create the Thread in your code, it gets a private copy of all the parameters passed into the enclosing scope. So it has a copy of the "reference to Object" or a private copy of the int for use later at execution time. – Teto Mar 04 '17 at 03:35
  • Regarding the reference. I googled "passing by reference or values" and I got the following. http://www.javaworld.com/article/2077424/learn-java/does-java-pass-by-reference-or-pass-by-value.html I then googled "are primitives thread safe in Java" and I got this link. http://stackoverflow.com/questions/9278764/are-primitive-datatypes-thread-safe-in-java The exceptions to the thread safety of primitives discussed there don't apply since you are getting a new copy. It's the same as the 3rd example in my reply, where you created a new Object each time you called 'doSomethingAsync'. – Teto Mar 04 '17 at 03:43
  • Neither of those refs addresses the question at hand. The answer depends on the compiler implementation, which hopefully is driven by the language semantics. Do you see that if the "final" merely causes the variable to be stored as a class member variable, my example will fail? In my case, the desired implementation would be to pass "obj" to the thread at creation time and store it as a member variable there. Maybe that's what it does; I'm trying to determine that empirically now. – Peri Hartman Mar 04 '17 at 04:40
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/137194/discussion-between-teto-and-peri-hartman). – Teto Mar 04 '17 at 04:50

If the Java compiler simply makes obj into a member variable, the last call to doSomethingAsync will overwrite the prior values of obj, making prior invocations of the thread use a wrong value

No, this will not happen. A subsequent call to doSomethingAsync cannot overwrite the obj captured by previous invocations of doSomethingAsync. This holds even if you remove the final keyword (assume Java let you do it just this once).

I think your question is ultimately about how closures work and are implemented in Java. However, your code does not really demonstrate the complication, because it never tries to modify the variable obj in the same lexical scope.

In a way, Java is not really capturing the variable obj, but its value. You could write your code in a different way, and the overall effect is the same:

class YourThread extends Thread {
    private final Object param; // assigned once, in the constructor

    public YourThread (Object obj){
        param = obj;
    }

    @Override
    public void run(){
        //do something with your param
    }
}

and you no longer need the final keyword:

public void doSomethingAsync (Object obj){
    Thread t = new YourThread (obj);
    t.start();
}

Now, say you have two instances of YourThread created, how could the second instance modify what has been passed as parameter to the first instance?
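If you want to check this empirically, here is a rough sketch of the "burst of calls" scenario from the question. BurstDemo, burst, and the extra seen/go/done parameters are my own additions for the experiment. A latch holds every thread back until all calls to doSomethingAsync have returned, and only then do the threads read obj:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class BurstDemo {
    // Calls doSomethingAsync `calls` times before any thread is allowed to touch obj,
    // then returns the set of values the threads actually saw.
    static Set<Integer> burst(int calls) {
        final Set<Integer> seen = ConcurrentHashMap.newKeySet();
        final CountDownLatch go = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(calls);
        for (int i = 0; i < calls; i++) {
            doSomethingAsync(Integer.valueOf(i), seen, go, done);
        }
        go.countDown(); // every call has returned; release the threads
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    static void doSomethingAsync(final Object obj, final Set<Integer> seen,
                                 final CountDownLatch go, final CountDownLatch done) {
        Thread thread = new Thread() {
            @Override public void run() {
                try {
                    go.await();              // wait until the whole burst is over
                    seen.add((Integer) obj); // record the value this thread captured
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }
        };
        thread.start();
    }

    public static void main(String[] args) {
        System.out.println(burst(50).size()); // prints 50
    }
}
```

If a later call overwrote the obj of an earlier one, the set would contain fewer than 50 distinct values. It does not, because each Thread instance holds its own copy.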


Closure in Other Languages

In other languages, magical things can indeed happen, but to show it you need to write the code slightly differently:

public void doSomethingAsync (Object obj){
    //Here let's assume obj is not null
    Thread thread = new Thread (){
        @Override 
        public void run () { ... /*do something with obj*/ ... }
    }

    thread.start ();
    obj = null;
}

This is not valid Java code, but in certain languages code like that is allowed. And the thread, when its run method is executed, might see obj as null.
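You can imitate that behaviour in valid Java by capturing a mutable holder instead of the variable itself. This is only a sketch of the idea: the one-element array, the latch and the names MutableCaptureDemo and observedAfterReset are my own, and the latch is there just to make the outcome deterministic:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class MutableCaptureDemo {
    // Returns what the thread finds in the holder after the caller has cleared it.
    static Object observedAfterReset(Object obj) {
        final Object[] holder = { obj };  // the array reference is final, its content is not
        final CountDownLatch reset = new CountDownLatch(1);
        final AtomicReference<Object> observed = new AtomicReference<Object>("not run");
        Thread thread = new Thread() {
            @Override public void run() {
                try {
                    reset.await();            // run only after the caller cleared the holder
                    observed.set(holder[0]);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        thread.start();
        holder[0] = null;   // plays the role of "obj = null" in the snippet above
        reset.countDown();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(observedAfterReset("hello")); // prints null
    }
}
```

Here the thread and the method share one holder, so the thread sees the change. With a plain captured local there is nothing shared to change.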

Similarly, in the below code (again, not valid in Java), thread2 could potentially impact thread1 if thread2 executes first and changes obj in its run method:

public void doSomethingAsync (Object obj){

    Thread thread1 = new Thread (){
        @Override 
        public void run () { ... /*do something with obj*/ ... }
    }

    thread1.start ();

    Thread thread2 = new Thread (){
        @Override 
        public void run () { ... /*do something with obj*/ ... }
    }

    thread2.start ();
}

Back to Java

Java requires obj to be final (or, since Java 8, at least effectively final) because, although its anonymous-class syntax looks very similar to the closure syntax of other languages, it does not implement the same closure semantics. Because obj is known never to change, Java does not need to create a separate capturing object (and thus an additional heap allocation); it can use something similar to YourThread behind the scenes. See this link for more details.

Xinchao