
What are the tradeoffs of representing state with a single atom holding a hash map versus multiple refs?

For example:

(def start (atom {:location "Chicago" :employer "John"}))

vs

(def location (ref "Chicago"))
(def employer (ref "John"))

Many thanks

muhuk
user2936410

3 Answers


The single atom version is better and has fewer tradeoffs. Assuming you never want the employer and the location to change in an uncoordinated way, the win is that you don't need a `dosync` block to change the location, the employer, or both. With the atom, you can simply call `(swap! start assoc :location "baz")`.

A big tradeoff of using multiple refs is that all transactions on the refs are tried in parallel; the first one to finish wins, and the others are restarted. While that is also true for atoms, having a separate ref for every entry requires more monitoring, grouping (for `dosync` blocks), etc. behind the scenes. To get fewer restarts, it makes sense to group the information in a hash map. Depending on whether coordinated change with other state is required, put that map in a ref or an atom.
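As a rough sketch of the difference in ceremony, reusing the names from the question (the new values "Boston" and "Jane" are made up for illustration):

```clojure
;; Single atom: one swap! changes both fields atomically, no dosync needed.
(def start (atom {:location "Chicago" :employer "John"}))

(swap! start assoc :location "Boston" :employer "Jane")

;; Multiple refs: a coordinated change has to happen inside a transaction.
(def location (ref "Chicago"))
(def employer (ref "John"))

(dosync
  (ref-set location "Boston")
  (ref-set employer "Jane"))
```

In both versions no reader can observe the new location together with the old employer; the atom version just gets there with a single function call.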

Leon Grapenthin

Multiple Refs allow for more concurrency, since all writes to an Atom are linearized. The STM allows for many parallel transactions to commit when there are no conflicting writes / ensures (and additionally it provides commute which allows one to make certain writes which would normally cause a conflict to not do so).
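A minimal sketch of the `commute` point, assuming two hypothetical counters (`page-views` and `sign-ups` are invented names) that are each kept in their own ref and incremented from many threads:

```clojure
;; Hypothetical counters, each in its own ref.
(def page-views (ref 0))
(def sign-ups   (ref 0))

;; commute re-applies inc at commit time, so concurrent increments of the
;; same ref do not force each other to retry. This is only safe because
;; the order in which increments are applied does not matter.
(defn record-view! []
  (dosync (commute page-views inc)))

(defn record-sign-up! []
  (dosync (commute sign-ups inc)))

;; Run 100 increments of each counter on separate threads, then wait for all.
(let [threads (doall (concat (repeatedly 100 #(Thread. ^Runnable record-view!))
                             (repeatedly 100 #(Thread. ^Runnable record-sign-up!))))]
  (run! #(.start ^Thread %) threads)
  (run! #(.join ^Thread %) threads))

[@page-views @sign-ups] ; => [100 100]
```

With a single atom holding both counters, every one of these 200 updates would contend on the same reference.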

Additionally, the STM cooperates with Agents -- actions sent to Agents from within a transaction will be performed if and only if the transaction commits. This allows one to cause side effects from inside a transaction safely. Atoms offer no similar facility.
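To illustrate, here is a small sketch under invented names (`balance`, `audit-log`, and `withdraw!` are hypothetical): a send made inside a transaction that commits is dispatched, while a send made inside a transaction that aborts with an exception is dropped.

```clojure
(def balance   (ref 100))
(def audit-log (agent []))   ; hypothetical log kept in an agent

(defn withdraw! [amount]
  (dosync
    ;; Held until commit; dropped if the transaction does not commit.
    (send audit-log conj [:withdraw amount])
    (when (> amount @balance)
      (throw (ex-info "Insufficient funds" {:amount amount})))
    (alter balance - amount)))

(withdraw! 30)                 ; commits, so the send is dispatched
(try (withdraw! 500)           ; throws, so the send is discarded
     (catch clojure.lang.ExceptionInfo _ :aborted))

(await audit-log)              ; wait for pending agent actions to run
[@balance @audit-log]          ; => [70 [[:withdraw 30]]]
```

In a standalone script, calling `(shutdown-agents)` at the end lets the JVM exit promptly.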

The trade-off is that the STM's overhead is larger than an Atom's, and certain anomalies become possible (write skew; see the Wikipedia page on snapshot isolation). It's also possible to achieve great concurrency with the STM while having serious problems obtaining a snapshot of the entire system; on this point, see Christophe Grand's excellent blog post and his megaref library.
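Write skew can occur when a transaction reads one ref to decide how to write another, because a plain read does not conflict with a concurrent write to the ref that was only read. `ensure` is the usual guard. A minimal sketch, assuming two hypothetical accounts and an invented rule that a withdrawal is allowed only while the combined balance covers it:

```clojure
(def checking (ref 50))
(def savings  (ref 50))

(defn withdraw! [from other amount]
  (dosync
    ;; ensure makes a concurrent write to `other` conflict with this
    ;; transaction; a plain @other read would not be protected.
    (when (>= (+ @from (ensure other)) amount)
      (alter from - amount))))

(withdraw! checking savings 80)  ; allowed: 50 + 50 >= 80
(withdraw! savings checking 80)  ; refused: 50 + -30 < 80
[@checking @savings]             ; => [-30 50]
```

With both balances in one map inside a single atom, the check and the update happen in the same `swap!`, so this particular anomaly cannot arise.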

In many scenarios people find that just storing all state in a single Atom is enough and that's definitely a simpler, more lightweight approach.

Michał Marczyk
  • +1 for balanced answer, but this confused me a little; `Additionally, the STM cooperates with Agents -- actions sent to Agents from within a transaction will be performed if and only if the transaction commits.` - Won't actions be executed when the atom is swapped? This reads like actions will be executed regardless of the success. – muhuk Nov 08 '13 at 23:15
  • If you `send` from within the function passed to `swap!`, then yes, the `send` will be reexecuted on each retry. Try e.g. `(def a (atom 0)) (def ag (agent nil)) (do (future (swap! a (fn [x] (Thread/sleep 5000) (send ag println :foo) (inc x)))) (Thread/sleep 1000) (swap! a inc))`: here `:foo` will be printed twice; if you used a Ref instead of an Atom, it would only be printed once. – Michał Marczyk Nov 09 '13 at 09:44

I don't think you should be thinking about trade-offs between atoms and refs, since they're used in different situations.

You should use an atom when you want to change a single thing atomically.

Refs use the STM and are for changing many different things together, in a transaction.

In your particular situation, the question to answer is about the thing you're changing:

  • Is it a single thing you want to (or can) change in one step?
  • Are there different things you want or need to change transactionally?

If you swap the entire old database for a new one just to change a single field, and on that basis call your database an atom, you're abusing the mechanism.

Hope the distinction helps. For your example I would use an atom nonetheless.

There's a good summary here with motivations behind each strategy.

guilespi