2

I would like to implement a ring buffer using purely functional data structure with the following operations

  • Efficient random access by index
  • Add to front
  • Remove from back

The reason for using persistent data structure is because I have one writer thread and several reader thread and I want to avoid the readers blocking the writer.

This can be done easily by having readers thread only hold the lock while taking a snapshot and then using the snapshot for processing.

What are the existing available data structure that support those operation?

  • Doubly-linked list can't do index lookup efficiently and are O(n)
  • Clojure PersistentVector based on Phil Bagwell Ideal Hash Tree, support access by index in log32N and subvec can be used to remove element from the start.
  • Hash array mapped trie could also be used by storing integer as key but might not be super efficient.

Which other purely functional data structure could be used in this case?

Kevin Ji
  • 10,479
  • 4
  • 40
  • 63
skyde
  • 2,816
  • 4
  • 34
  • 53
  • Can you describe a use-case for this? It sounds like you have contradictory goals. – Alan Thompson Oct 19 '18 at 18:44
  • Readers still have to block the writer; you can't risk that a write update will invalidate an index during a read. – chepner Oct 19 '18 at 19:06
  • Or more accurately, an *existing* read blocks a writer, but a read request cannot preempt a pending write request, to avoid starvation. – chepner Oct 19 '18 at 19:09
  • chepner: the object is immutable and never updated in place. The writer cannot modify it it can only create a new version that share some object with the old version. The reader and the writer will simply lock while updating the pointer that point to the latest version. No lock is needed to use the snapshot after the snapshot is taken. – skyde Oct 19 '18 at 21:16
  • Why do you need random access? It seems that without it, you could just use one of the several [concurrent queues](https://stackoverflow.com/q/27933941/1333025). – Petr Oct 20 '18 at 08:35
  • yes I need random access, because several reader are reading from different position concurrently. – skyde Oct 21 '18 at 03:59

1 Answers1

5

The finger tree (in the standard library as Data.Sequence) is the go-to for persistent random-access sequences. I think it satisfies your criteria––random access indexing is O(log n) (more specifically, the log of the index's distance from the edge), the others are O(1). I'm not aware of any persistent data structures that do better than that.

luqui
  • 59,485
  • 12
  • 145
  • 204
  • thanks a lot luqui, so finger tree are better than clojure PersistentVector? – skyde Oct 19 '18 at 21:19
  • After a little reading, it seems like clojure persistent vectors are just 32-ary trees. Finger trees will be better at push and pop at both ends; random access should be comparable. – luqui Oct 19 '18 at 22:12
  • 2
    @skyde, the finger trees in `Data.Sequence` have fairly bad constant factors. You might be better off eating a (small) logarithmic cost for access to the ends in exchange for better random access from a bushier tree. It all depends on your usage patterns of course. – dfeuer Oct 20 '18 at 04:08
  • @dfeuer could you clarify what you mean when you say Data.Sequence have fairly bad constant factors. Do you mean that random access will be slow compared to 32-ary tree ? – skyde Oct 21 '18 at 04:00
  • 1
    Clojure persisten vector cannot be used as a queue because subseq maintains a reference to the entire underlying vector, so none of the items being "popped" from the back will ever be garbage collected :( – skyde Oct 21 '18 at 04:14
  • would using Data.Sequence as a queue cause the same problem when removing element? – skyde Oct 21 '18 at 06:16
  • No, `Data.Sequence` shouldn't suffer from that problem; it doesn't do slicing. Yes, random access will generally be slower than for a 32ary tree. Also, a tree designed for reasonably efficient pushes and pops from either side is likely slower than one designed for optimal queue performance (when that's all you need). – dfeuer Oct 21 '18 at 07:18