14

I'm looking for an efficient way to implement a concurrent tree structure. If that helps, assume that I have a lot more read accesses than changes to the structure.

The tree should support these operations:

  • Adding and removing nodes
  • Sort branches every time a new node is inserted
  • Iterate over all the nodes (without ConcurrentModificationException)
  • Look up an element by path
Aaron Digulla
  • 321,842
  • 108
  • 597
  • 820

4 Answers4

9

Take a look at: Concurrent-Trees on Google code for a way to modify tree-like structures without locking.

The project provides Concurrent Radix and Suffix trees for Java. They supports concurrent reads and writes, and reads are lock-free. It works by applying patches to the tree atomically. While these types of tree might not be exactly what you want, the approach using "patching" as described in TreeDesign is useful for any kind of tree-like structure.

The trees are intended for high concurrency read-mostly use cases, where (say) a background thread might be inserting or deleting entries from the tree while many foreground threads would continue to traverse it unimpeded by the modifications.

beardhatcode
  • 4,533
  • 1
  • 16
  • 29
npgall
  • 2,979
  • 1
  • 24
  • 24
  • This is interesting but useless in my use case. My trees are real trees (parent-child structures), not key-value maps. – Aaron Digulla Jul 03 '12 at 12:43
  • 3
    I'd avoid using words like "useless". This answer was contributed to help you voluntarily. If you are looking for read-mostly concurrent trees, lock-free algorithms can be a good way to go. A radix tree might not be exactly what you are looking for, but as I mentioned, the atomic patching approach I linked to, can be applied to any type of tree for lock-free reads, even if you are planning to write your own custom tree. Incidentally radix trees are obviously trees, having parent-child structures. – npgall Jul 03 '12 at 22:56
  • The new wording looks good. I also removed the final paragraph about documentation as I've finished adding remaining documentation to the site. – npgall Jul 04 '12 at 13:16
  • This is the solution which we finally used. Major drawbacks: Memory consumption and change operations are pretty expensive since you need to copy the tree every time. But the pros outweight them by far: Simple code structure, no deadlocks, no read locks, it's very easy to merge several tree changes into one operation (so I need only one copy), implementation doesn't leak, guaranteed to be thread safe, no corner cases. – Aaron Digulla Jul 06 '12 at 10:30
  • Depending on the type of tree, it might not be necessary to copy the entire (sub-)tree for every modification. Some trees (e.g. radix trees), have locality of reference (LOR), so modifications mid-branch can require only a few nodes to be allocated for the patch, and descendants of nodes being replaced, can usually be attached directly to the patch unmodified. In trees without LOR, writes would be more expensive for sure. Since your tree can be modified by customers, try asking them nicely to observe locality of reference! I'm glad this helped, best of luck with your project. – npgall Jul 06 '12 at 15:15
  • Optimizations are possible but not necessary - that's what I like about the approach. If the trees turn out to be too slow, I *can* replace the simple 5-line recursive copy into something more complex - but I don't *have* to :-) – Aaron Digulla Jul 09 '12 at 07:45
5

The Java structure that is the closest to what you may need is a ConcurrentSkipListSet (or maybe ConcurrentSkipListMap).


If you need a more custom approach, you can implement a custom tree structure if you have a hierarchic read-write lock. Here is an answerto a similar question about how to implement a reentrant read-write lock: https://stackoverflow.com/a/6154873/272388

Community
  • 1
  • 1
Kru
  • 4,195
  • 24
  • 31
  • 1
    The CSLM is not really close. It is a *sorted map* having little to do with a tree structure. So it only shares a common property with the `TreeMap` in that it is a sorted map -- which is not relevant here. – Marko Topolnik Jun 25 '12 at 13:16
  • That's why there is a second part of the answer as I don't know any tree structure included in Java that supports all of these operations. Also, CSLM might be enough for some similar cases. – Kru Jun 25 '12 at 13:22
  • 1
    Kru is still correct: It's the only data structure in the official runtime which is closest to my problem. I could build the tree using nested ConcurrentSkipListMaps + locks. – Aaron Digulla Jun 25 '12 at 13:22
  • Is there any reason it must be tree? hmm, a concurrent (lock-free) treemap is a research topic. Ya, we have COW B-Tree, but they are crazy. – J-16 SDiZ Jun 25 '12 at 13:22
  • 1
    @AaronDigulla Wouldn't a `ConcurrentSkipListSet` be a better tool to implement a single node's children container? – Marko Topolnik Jun 25 '12 at 13:24
  • @AaronDigulla I thought about it some more -- wouldn't the CSLS in fact cover all the necessary locking, so your code would not need any kind of explicit locking mechanism? All insertions and deletions come down to insert/delete in the list of children. – Marko Topolnik Jun 25 '12 at 15:30
3

You could use a reader-writer lock in your structure, in a way that multiple threads can read cuncurrently but only one thread at a time can modify it. If some thread tries to modify the structure it cannot do it until all of the readers have done their reads. If a thread wants to read it can only if a writer isn't already working, or it is on shedule of doing some modifications. Maybe a look at this could help:

http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html

giocarmine
  • 580
  • 1
  • 13
  • 29
1

Basic idea with Read/Write lock

For all write methods use following idiom:

someWriteMethod(){
  writeLock.lock();
  // logic
  writeLock.unlock();
}

For all read methods use similar code:

someReadMethod(){
  readLock.lock();
  // logic
  readLock.unlock();
}
  • If at least one method performs writing, no one can obtain read or write lock.
  • If at least one method performs reading, any numbers of thread can obtain readlock.
  • If at least one method performs reading, no one can obtain write lock.

Note, if your code (replace logic comment above) can throw exception make sure you release locks before exit from method in finally section.

mishadoff
  • 10,719
  • 2
  • 33
  • 55
  • The read-write lock has one disadvantage: You can't "upgrade" a read to a write lock, so you need to know in advance whether your operation will modify the tree. So that would need two iterators where one might lock way too much. – Aaron Digulla Jun 25 '12 at 13:19
  • I assume that using trees you know in advance what operations modify your tree, isn't it? – mishadoff Jun 25 '12 at 13:20
  • When I ask the tree for an iterator, the method that creates the iterator can't know whether I'll call remove() or not. – Aaron Digulla Jun 25 '12 at 13:20
  • Is your tree very big? How about returning the copy? – mishadoff Jun 25 '12 at 13:22
  • I'm wondering myself. My trees usually aren't big but there is no artificial limit to them so customers can make them as big as they want :-/ – Aaron Digulla Jun 29 '12 at 08:52