37

How should I pick the backend storage service for Datomic?

Is it a matter of preference to select, say, DynamoDB instead of Postgres, or does each option have different tradeoffs? If so, what are they?

tosh
konr

1 Answer

23

Storage Services Requirements

Datomic's storage services should generally meet three requirements:

  1. Implement key-value store semantics: efficient read/write access to values by indexed key.
  2. Support consistent reads, e.g. read-your-own-writes. Ideally, reads are lock-free and free of contention.
  3. Support conditional puts, e.g. for optimistic locking and snapshot isolation.

Datomic uses storage services to store blocks of sorted, compressed datoms, much as traditional database systems use file systems. The requirements above are pretty much the API between the underlying storage service and Datomic, so the choice of storage service depends on how well it supports those three requirements.
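To make the three requirements concrete, here is a minimal sketch of what such a storage interface could look like. It is a toy in-memory store, not Datomic's actual storage protocol: the class name, the method names (`get`, `put`, `put_if`) and the keys are all made up for illustration, and it uses a plain lock instead of the lock-free reads that would be ideal.

```python
import threading
from typing import Dict, Optional


class InMemoryStorage:
    """Toy key-value store illustrating the three requirements (hypothetical API)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[bytes]:
        # Consistent read: a completed put is visible to every later get.
        with self._lock:
            return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        # Unconditional write, e.g. a new block stored under a fresh key.
        with self._lock:
            self._data[key] = value

    def put_if(self, key: str, expected: Optional[bytes], new: bytes) -> bool:
        # Conditional put: write only if the current value equals `expected`.
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


store = InMemoryStorage()
store.put("segment-1", b"sorted, compressed datoms")
first = store.put_if("root", None, b"v1")   # succeeds: key is absent
stale = store.put_if("root", None, b"v2")   # fails: expectation is out of date
fresh = store.put_if("root", b"v1", b"v2")  # succeeds: expectation matches
```

The conditional put is what lets a writer detect that its view of a key is stale without holding a lock across the read and the write, which is the optimistic-locking pattern mentioned in requirement 3.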

Write Scalability

Datomic doesn't usually put a lot of write pressure on the underlying storage service, since only one component writes to it: the Transactor. Datomic also uses a background indexing job that integrates novelty into storage once enough of it has accumulated (~32 MB by default, configurable), which further reduces the constant write load. The only thing Datomic writes immediately is the transaction log.
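The write pattern described above can be illustrated with a toy model. This is only a sketch of the batching idea, not how the Transactor is implemented: `ToyTransactor`, its counters and the tiny threshold are invented for the example, and a real indexing job writes many segments rather than incrementing a counter.

```python
from typing import List


class ToyTransactor:
    """Single writer: logs every transaction, indexes only in batches."""

    def __init__(self, threshold_bytes: int) -> None:
        self.threshold_bytes = threshold_bytes
        self.log_writes = 0   # immediate writes to the transaction log
        self.index_jobs = 0   # batched index integrations
        self._novelty: List[bytes] = []
        self._novelty_bytes = 0

    def transact(self, tx: bytes) -> None:
        # The log entry is written right away.
        self.log_writes += 1
        # The new data is kept in memory as novelty until the threshold is hit.
        self._novelty.append(tx)
        self._novelty_bytes += len(tx)
        if self._novelty_bytes >= self.threshold_bytes:
            self._run_index_job()

    def _run_index_job(self) -> None:
        # Integrate all accumulated novelty into storage in one batch.
        self.index_jobs += 1
        self._novelty.clear()
        self._novelty_bytes = 0


transactor = ToyTransactor(threshold_bytes=1000)
for _ in range(25):
    transactor.transact(b"x" * 100)
```

With 25 transactions of 100 bytes each and a 1000-byte threshold, the log is written 25 times but the index is touched only twice, which is the sense in which batching lowers the constant write load.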

Read Scalability

Datomic uses multiple layers of caching, i.e. memcached and the peers' own caches, so in ideal circumstances, i.e. when the working set fits in memory, the system won't put a lot of read pressure on storage either.
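A rough sketch of that layered read path, under simplifying assumptions: plain dicts stand in for the peer cache, memcached and the storage service, nothing is ever evicted or invalidated, and `LayeredReader` and the segment keys are hypothetical names.

```python
from typing import Dict


class LayeredReader:
    """Read-through lookup: peer cache, then shared cache, then storage."""

    def __init__(self, storage: Dict[str, bytes], shared_cache: Dict[str, bytes]) -> None:
        self.storage = storage
        self.shared_cache = shared_cache      # stand-in for memcached
        self.peer_cache: Dict[str, bytes] = {}
        self.storage_reads = 0

    def read(self, key: str) -> bytes:
        # 1. Local peer cache: no network traffic at all.
        if key in self.peer_cache:
            return self.peer_cache[key]
        # 2. Shared cache: a hit still avoids the storage service.
        if key in self.shared_cache:
            value = self.shared_cache[key]
        else:
            # 3. Only a miss in both caches reaches storage.
            self.storage_reads += 1
            value = self.storage[key]
            self.shared_cache[key] = value
        self.peer_cache[key] = value
        return value


storage = {"seg-a": b"A", "seg-b": b"B"}
memcached_stand_in: Dict[str, bytes] = {}
peer1 = LayeredReader(storage, memcached_stand_in)
peer2 = LayeredReader(storage, memcached_stand_in)

for _ in range(3):
    peer1.read("seg-a")   # first call hits storage, the rest hit the peer cache
peer2.read("seg-a")       # served from the shared cache, not from storage
```

In this run storage is read once in total, even though the segment is requested four times across two peers. Once the working set no longer fits in the caches, most reads fall through to step 3 and the storage service starts to matter.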

System Load

If your system doesn't require huge write scalability and your application data tends to fit in memory, then the choice of a particular storage service is irrelevant, except of course for its operational capabilities (backups, admin tools, etc.), which have nothing to do with Datomic.

If, on the other hand, your system does require huge write scalability, or you have a great number of peers, each working with more data than fits in its memory (forcing a lot of data segments to be fetched from storage), you'll need a storage system that can scale horizontally, e.g. DynamoDB. As mentioned in one of the comments, if you need arbitrary write scalability, Datomic is not the right system for you anyway.

a2ndrade
  • 1
    A note: If you truly need *huge* write scalability Datomic might not be the best choice. "Datomic trades off arbitrary write scalability to retain arbitrary transactions and joins, and has a strong data and query model, with arbitrary read and query scaling." http://www.datomic.com/faq.html – overthink Jul 29 '13 at 15:55
  • @overthink thanks for pointing that out since people unfamiliar with Datomic may not realize that's a design limitation. – a2ndrade Jul 29 '13 at 21:59
  • 1
    @a2ndrade Great answer. Do you happen to have an opinion on how each of the supported storage solutions stack up on these three requirements? – neverfox Apr 04 '15 at 19:36