4

I'm trying to code some Clojure style Agents in F# using MailboxProcessors. Here's what I have so far:

namespace GameEngine

type Agent<'T>(inital:'T) = 
    let mutable state:'T = inital

    let queue = new MailboxProcessor<'T -> 'T>( fun inbox ->
            let rec loop count = 
                async {
                    let! msg = inbox.Receive()
                    state <- msg(state)
                    return! loop(count + 1)
                }
            loop 0)

    do
        queue.Start()

    member self.Send(action:'T -> 'T) =
        queue.Post(action)
    member self.Deref() =
        state

So the basic idea is that we have a mutable state that can be updated by calling .Send(). My question is, will my messages ever be out of order? If msg A is sent before B will the async function above always process A before B?

Seems like there should be a class like this already in F#? Am I reinventing the wheel?

Alex Miller
  • 69,183
  • 25
  • 122
  • 167
Timothy Baldridge
  • 10,455
  • 1
  • 44
  • 80
  • You can avoid mutable state by passing the current state as a parameter to the "loop" function, if you want... – Joel Mueller Jan 28 '11 at 15:27
  • That would work, but the key issue would be the Deref() method. This method allows for infinite non-blocking readers. Basically this is a threading-enabled-one-writer-many-reader-lock. – Timothy Baldridge Jan 28 '11 at 15:51
  • So you are recreating ReaderWriterLockSlim, but without the added safety of blocking reads during a write? PostAndAsyncReply would be a safer and non blocking way of implementing Deref, but it would require the calling code to be async – Joel Mueller Jan 28 '11 at 16:45
  • Maybe I should clarify. This class is only to be used with immutable classes. In that way, .Deref() is always safe (see http://stackoverflow.com/questions/9666/is-accessing-a-variable-in-c-an-atomic-operation ). You may get a older copy of the state than that that is being currently processed, but you can always read the state and be assured that the data is in a consistent state. Immutability is key in this case. The only mutable objects in my program are the states within Agents. Sure, I could use ReadWriteSlim, but with actors you also get automatic concurrency. – Timothy Baldridge Jan 28 '11 at 16:50
  • I would also recommend this video http://blip.tv/file/812787 It's talking about Clojure, but many of the concurrency concepts in Clojure can be ported to F#, and that's what I'm doing. – Timothy Baldridge Jan 28 '11 at 16:55
  • This sounds like a really bad idea. Why would you ever want to expose the internal mutable state of an agent in a thread unsafe way like this? – J D Jan 30 '11 at 18:41
  • Ah, according to the Clojure docs the state is supposed to be **immutable** so you want a `ref` in F# and set it from the accumulator as Joel described. However, this is an XY question: you described the solution instead of the problem. You mention using this for physics below, in which case you might want to ask how to solve that problem in F# instead. – J D Jan 30 '11 at 19:02

2 Answers2

2

If msg A is sent before B will the async function above always process A before B?

Yes. (You can see the code for Mailbox

http://fsharppowerpack.codeplex.com/SourceControl/changeset/view/54799#970072

browse to compiler\2.0\Nov2010\src\fsharp\FSharp.Core\control.fs, and eventually see e.g.

   member x.Post(msg) =
       lock syncRoot (fun () ->
           arrivals.Enqueue(msg);
           ...

which shows it's just a queue under a lock.)

Seems like there should be a class like this already in F#? Am I reinventing the wheel?

Well, it's not immediately clear to me how this is different from just updating a mutable global variable willy-nilly (modulo atomicity of simultaneous updates; you said "before" in the question, so I am unclear if that aspect matters to you). What's the context for wanting this?

Brian
  • 117,631
  • 17
  • 236
  • 300
  • It's a co-currency method. With the above Agent class I'm programming several physics simulations. Each object in my system is an Agent with an immutable state. By passing lambda statements to the .Send() methods I can apply changes to the objects in the simulation. The resulting code is thread-safe, and highly scalable. The Clojure docs probably explain it better: http://clojure.org/agents – Timothy Baldridge Jan 28 '11 at 02:28
  • @Timothy: This doesn't look highly scalable to me. How do you assure locality of reference, for example? – J D Jan 30 '11 at 18:47
  • @Jon - I'm not exactly sure I understand the question. From what I'm reading, "locality of reference" refers to the concept that data that is accessed once will normally have the data around it accessed again soon. I'm not sure how that applied. Perhaps by just seeing the above Agent class, you missunderstand the use of such an object. Agents are not designed to be used by every variable in a program. Instead they are used in areas where parrallelism makes sense. For instance, in a physics simulation, one could use an Agent for each object being simulated not for one object per Vector or float – Timothy Baldridge Jan 31 '11 at 13:28
  • @Jon - Therefore if I access the state of an Agent, I could be accessing and modifying megabytes of information with a single .Deref. The key is to make these divides frequent enough to allow for massive parallelism, but not too often where you are having to may many calls to .Send() to accomplish a single task. – Timothy Baldridge Jan 31 '11 at 13:30
  • @Timothy: According to the Clojure docs, the state of the agent is immutable so you might be accessing megabytes of data but you should be modifying none of it. The only thing you modify is the reference to the immutable data. In F#, that can be accomplished with `ref`. However, if you want scalability then you're really after performance (specifically throughput) in which case you shouldn't be using any of this (no agents, no asynchronous workflows, no mailbox processors). Use Cilk-style parallel code with the TPL instead. – J D Jan 31 '11 at 16:25
  • @Jon - Scalability (and perhaps I'm using the wrong word here), the way I use it here, mean that if I can handle 100 agents on one CPU...perfect scalability should allow me to handle 400 agents on a Quad-core machine. Now, if we have all 400 agents pounding .Send() on a single agent, then yes, we will not see those levels of performance. But assuming uniform distribution of activity on the agents we should see reasonable performance gains. The idea behind Agents is that writes to the mutable state will not be changed underneath a .Send() lambda while it's running. – Timothy Baldridge Jan 31 '11 at 16:40
  • @Timothy: That is an appropriate use of "scalability" but your assumptions do not hold on multicore hardware (although it would be suitable for a supercomputer). Specifically, you are assuming that separate threads running separate computations on separate memory will see good scalability due to their apparent independence but that is not the case on a multicore because the cores all contend for a single shared memory. – J D Feb 01 '11 at 09:15
  • Removing explicit synchronization, as you say, is certainly a step in the right direction but reducing the number of cache misses is usually at least as important for performance and harder to do in practice because the contention is implicit. The best known way to do this is to write cache oblivious algorithms in the Cilk style, e.g. using the Task Parallel Library in .NET from Visual F# 2010. Failing to take this into account is a common mistake. I gave a lecture that mentioned this phenomenon here: http://skillsmatter.com/podcast/agile-testing/jon-harrop-qr-decomposition – J D Feb 01 '11 at 09:25
  • You may also be interested in this story I wrote about the world's foremost peer-reviewed research on parallel Haskell making the same basic mistake: http://flyingfrogblog.blogspot.com/2010/06/regular-shape-polymorphic-parallel.html – J D Feb 01 '11 at 09:30
  • 2
    I watched your lecture. The results are impressive, but fail to cover any topics of synchronization. You don't have a need for locks as the threads will never step on the toes of the other. In my simulation example we have the phsyics engine, the object behaviors (AI), and outside input (the user clicking an entity) all modifying the behavior of the objects. At any time, the GUI can also jump into the mix and read the state of all the objects and render them up. True, there will be cache misses, but the only other option is to use locks to syncronize behvior. And that would be a nightmare. – Timothy Baldridge Feb 01 '11 at 13:57
  • @TimothyBaldridge "...the only other option..." There are many options. I had a similar problem when working on a system that visualizes the results of numerical computations in a financial system. My solution was to have the number crunching code post updates to what I called a "decoupling" agent and have the GUI post messages to it requesting that it send an update to the GUI. – J D Jun 30 '12 at 20:37
1

There is no built-in implementation of the Clojure-style Agent.

I also at one point worked up a quick and dirty F# implementation similar to yours, but did not take the time to consider all the correctness issues involved; in particular, is it not true that 'T may be a value type (a struct) larger than 64 bits (or 32 bits as the case may be) which could cause a tear (I presume that Clojure like Java doesn't have structs to worry about here). Perhaps an F# generic type constraint ('T when 'T : not struct) would be needed?

Bertie77
  • 31
  • 1
  • you bring up an interesting point. And from I'm reading here http://blogs.msdn.com/b/brada/archive/2003/04/03/49896.aspx there could be other issues as well. My class above will instantly populate the data change to the other CPUs on the machine. To do that I need to use Interlocked.Exchange. And yes, I need a constraint against structs. Not much of a way around that one. – Timothy Baldridge Jan 28 '11 at 18:14
  • You can box and unbox yourself rather than passing the buck to the user via a constraint. Why `Interlocked.Exchange` though? In the absence of any synchronization, the result of reading the state conveys little useful information... – J D Jan 30 '11 at 18:51
  • "Perhaps an F# generic type constraint". Use a `ref` instead. – J D Jan 30 '11 at 19:03
  • From what I'm reading, although the CLR states that an update to a 32 bit value on a 32 bit system will be atomic, that value will not propigate immediately to all the CPUs in the system (that is only insured via .Exchange). So, it would be possible (although unlikely) that CPU1 would read and write the shared state, but then the agent thread would get migrated to CPU2. There the value would be read again, but that value would be stale, as CPU2's cache did not know about the memory update yet. Interlocked.Exchange alerts all CPUs that they need to refresh their cache before accessing the state – Timothy Baldridge Jan 31 '11 at 13:20