5

I've seen this claimed in several places, including on SO: https://stackoverflow.com/a/20467457/243238, https://stackoverflow.com/a/4400389/243238. I get the point that locks are not needed to modify the data but you end up with multiple versions of it after concurrent modifications. That doesn't seem very useful in practice. I've tried to describe this with a simple scenario below:

Let's say we have 2 threads A and B. They are both modifying a purely functional dictionary D. They don't need locks at this point because the data is immutable so they output new dictionaries DA and DB. Now my question is how do you reconcile DA and DB so that later operations can see a single view of the data?

EDIT: The answers are suggesting to use a merge function over DA and DB. I don't see how that solves the problem since there could be another thread C which runs concurrently with the merge operation. The claim is that purely functional data structures are lock-free but using a merge function sounds more like eventual consistency which is a different thing.

Daniel
  • 26,899
  • 12
  • 60
  • 88
  • This is not an answer to your question, but I think if you want to share modifications to a data structure you should consider using `STM` as [discussed in this tutorial](https://www.fpcomplete.com/school/advanced-haskell/beautiful-concurrency). – Gabriella Gonzalez Apr 16 '14 at 22:09

5 Answers5

5

Good observation. But this just means that parallelism in the presence of a "global state" is difficult or impossible, regardless of the programming language.

In impure languages, this is just less obvious. You may think that you can get away with locking, synchronization and all the highly complex code around it.

Whereas in pure languages, you have n pure state transformers and cannot but realize that if you apply them in pareallel to some initial state S0 you end up with so many states S1, S2, S3 ... Sn

Now, the purity just guarantees that the parallel transformations of S0 do not interfere with each other in any way. It doesn't solve your program design issues, such as what that state means, if it is maybe too coarse, or too fine grained and if you can compute an end result form various transformed states.

Ingo
  • 36,037
  • 5
  • 53
  • 100
3

The point with functional data structures is atomicity. In your scenario, DA and DB are guaranteed to be consistent, how to merge them into a consistent value depends on your application (functional programming will not solve ALL your problems for you). This is still however more than what you get in imperative paradigm where DA and DB depend on how the threads are sequenced.

Anupam Jain
  • 7,851
  • 2
  • 39
  • 74
2

In general there are a lot of ways to do it. In practice, you may want to ensure that either DA or DB is canonical. If that cannot be done then there should be a merge process

Thread A/B: 

da  <- readSharedMemory at D
da' <- modifyDict da
mergeSharedMemory da'

where mergeSharedMemory is capable of combining two dictionaries with a similar history in a consistent fashion. Generally, these kinds of structures are well-studied in the database literature as CRDTs—"convergent replicated data types" and "commutative replicated data types". The INRIA Paper has a great deal of detail about this.

J. Abrahamson
  • 72,246
  • 9
  • 135
  • 180
  • I see how this makes sense in a distributed system. I'm still not convinced that this mechanism is equivalent to a lock-free data structure in a single system environment with multithreading. – Daniel Apr 17 '14 at 22:13
  • I think I responded less to that question and more to the body. I don't think this is lock-free—you need an atomic read-merge-write operation. – J. Abrahamson Apr 18 '14 at 03:23
2

Purely functional data structures are the easiest way to make lock-free mutable data structures. Given an arbitrary purely functional data structure T, you can define a lock-free mutable data structure newtype U = U (IORef T). Given an arbitrary function f :: T -> (T, a), you can write

mf :: U -> IO a
mf (U ref) = atomicModifyIORef' ref f

Why, then, is there a whole literature of fancy techniques for making lock-free data structures? Because not all data structures can be implemented efficiently enough in a purely functional way. Any operation that takes too long will slow all the threads down and lead to a backlog of updates.

dfeuer
  • 48,079
  • 5
  • 63
  • 167
1

You create a third function taking DA and DB that produces DAB.

Lasse V. Karlsen
  • 380,855
  • 102
  • 628
  • 825