I'm wondering is it thread-safe to synchronize only on the add method? Are there any possible issues if contains is not synchronized as well?
Short answers: No and Yes.
There are two ways of explaining this:
The intuitive explanation
Java synchronization (in its various forms) guards against a number of things, including:
- Two threads updating shared state at the same time.
- One thread trying to read state while another is updating it.
- Threads seeing stale values because memory caches have not been written to main memory.
In your example, synchronizing on add is sufficient to ensure that two threads cannot update the HashSet simultaneously, and that both calls will be operating on the most recent HashSet state.
However, if contains is not synchronized as well, a contains call could happen simultaneously with an add call. This could lead to the contains call seeing an intermediate state of the HashSet, leading to an incorrect result, or worse. This can also happen if the calls are not simultaneous, due to changes not being flushed to main memory immediately and/or the reading thread not reading from main memory.
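As a minimal sketch of the fix, assuming the HashSet is wrapped in a class of your own (SafeSet and items are hypothetical names, not taken from the question), both methods can be made to synchronize on the same object:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper; the class and field names are illustrative only.
final class SafeSet<T> {
    // The underlying HashSet is never exposed, so every access goes
    // through one of the synchronized methods below.
    private final Set<T> items = new HashSet<>();

    // Writers take the intrinsic lock on this SafeSet instance.
    public synchronized boolean add(T item) {
        return items.add(item);
    }

    // Readers take the same lock, so a contains call cannot overlap an add
    // and it sees the effects of every add that released the lock before it.
    public synchronized boolean contains(T item) {
        return items.contains(item);
    }
}
```

The point of the sketch is that add and contains use the same lock (here, the SafeSet instance itself). Synchronizing contains on a different object would not help.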
The Memory Model explanation
The JLS specifies the Java Memory Model, which sets out the conditions that a multi-threaded application must fulfil to guarantee that one thread sees the memory updates made by another. The model is expressed in mathematical language and is not easy to understand, but the gist is that visibility is guaranteed if and only if there is a chain of happens-before relationships from the write to a subsequent read. If the write and read are in different threads, then synchronization between the threads is the primary source of these relationships. For example, in:
// thread one
synchronized (sharedLock) {
    sharedVariable = 42;
}
// thread two
synchronized (sharedLock) {
    other = sharedVariable;
}
Assuming that the thread one code runs before the thread two code, there is a happens-before relationship between thread one releasing the lock and thread two acquiring it. With this and the "program order" relations, we can build a chain from the write of 42 to the assignment to other. This is sufficient to guarantee that other will be assigned 42 (or possibly a later value of the variable) and NOT any value that was in sharedVariable before 42 was written to it.
Without both synchronized blocks synchronizing on the same lock, the second thread could see a stale value of sharedVariable; i.e. some value written to it before 42 was assigned to it.
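For reference, the fragment above could be packaged as a compilable class along the following lines. This is only a sketch: SharedBox, write and read are hypothetical names introduced for illustration, while sharedLock and sharedVariable follow the fragment.

```java
// Hypothetical holder class; SharedBox, write and read are illustrative names.
final class SharedBox {
    private final Object sharedLock = new Object();
    private int sharedVariable; // only ever accessed while holding sharedLock

    // Corresponds to the "thread one" code: the write happens inside the lock,
    // and releasing the lock publishes it to the next thread that acquires it.
    void write(int value) {
        synchronized (sharedLock) {
            sharedVariable = value;
        }
    }

    // Corresponds to the "thread two" code: acquiring the same lock completes
    // the happens-before chain from the earlier write to this read.
    int read() {
        synchronized (sharedLock) {
            return sharedVariable;
        }
    }
}
```

If one thread calls write(42) and another thread later calls read(), the release and acquire of sharedLock provide the chain described above. Removing synchronized from either method breaks that chain.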