2

I need to cache something in Scala in a multi-threaded environment.

Reading up on scalaz's Memo I found the following comment in the code for the immutable hash map memo:

As this memo uses a single var, it's thread-safe.

The code looks like this:

  def immutableMapMemo[K, V](m: Map[K, V]): Memo[K, V] = {
    var a = m

    memo[K, V](f =>
      k => {
        a get k getOrElse {
          val v = f(k)
          a = a updated (k, v)
          v
        }
      })
  }

Saying that this is thread safe goes against what I have read and learned so far about thread-safety on the JVM-platform; Reference updates may be atomic, but as I have understood it, the compiler may try do certain optimisations that upsets the happens-before relationship if you don't have a memory barrier. See for instance this post and this.

But I'm sure the scalaz folks are pretty smart. Maybe there's something special about the scope of a.

Is what the comment claims true, and if so, why?

Michael
  • 41,989
  • 11
  • 82
  • 128
Erik Madsen
  • 1,913
  • 3
  • 20
  • 34
  • What actually goes here against thread safety it in your experience? `var` update is reference assignment, so it's atomic. All other operations don't mutate objects, so any time in the `a` there is some consistent version of `Map[k,V]` – Odomontois Nov 12 '15 at 13:31
  • I'll update my question to clarify. – Erik Madsen Nov 12 '15 at 13:36

1 Answers1

5

First of all, since the var is not marked @volatile, you might see different versions of a in different threads. So you might do a calculation multiple times on different threads. This kind of defeats the purpose of memoization, but other than that it does not cause any harm, provided that the function being memoized is without side-effects.

Also, on the x86 architecture you will almost always see changes done on one thread on all other threads.

Regarding internal consistency of the map: As far as I know, in this case it is not possible to observe the map stored in a in an inconsistent state, because Map is not just observably immutable, but all versions of Map (Map1, Map2, Map3, Map4, HashMap1, HashTrieMap, HashMapCollision1, EmptyMap) have only final fields and are therefore safe according to the java memory model. However, relying on this is extremely fragile.

For example if a would contain a List or a Vector, you would be able to observe it in an inconsistent state when quickly updating it from different threads. The reason for this is that these data structures are observably immutable, but do use mutable state internally for performance optimization.

So bottom line: don't rely on this for memoization in a multithreaded context.

See this thread on scala-user for a discussion of a very similar problem

See this thread for why even basic observably immutable data structures such as List and Vector can be observed in an inconsistent state unless using safe publishing via @volatile or another safe mechanism such as actors.

Rüdiger Klaehn
  • 12,445
  • 3
  • 41
  • 57