3

I'm using a red-black binary tree with linked leafs on a project (Java's TreeMap), to quickly find and iterate through the items. The problem is that I can easily get 35000 items or so on the tree, and several times I have to remove "all items above X", which can be almost the entire tree (like 30000 items at once, because all of them are bigger than X), and that takes too much time to remove them and rebalance the tree each time.

Is there any algorithm that can help me here (so I can make my own tree implementation)?

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
paulotorrens
  • 2,286
  • 20
  • 30
  • What else are you commonly doing with the data? There's always a trade-off, but you might be able to optimise the data structure/algorithms for your main usage patterns. – AndrewC Jul 14 '13 at 21:54
  • The map is used for a syntax highlighting component, it keeps track of each token and it's color, using a mapping from Integer to Token. Each drawing cycle I need to get a portion of the tree ("every token between positions X and Y"), and that's why the items are also linked, for fast iteration. Whenever the user types something on the component, every token after it is removed... so, if he's editing at the start of the file, lots of nodes have to be removed at once. If he's editing at the end of the file, just a few need to be removed from the tree. I'm trying to optimiza this behaviours. =/ – paulotorrens Jul 15 '13 at 20:59

1 Answers1

2

You're looking for the split operation on a red/black tree, which takes the red/black tree and some value k and splits it into two red/black trees, one with all keys greater than or equal to k and one with all keys less than k. This can be implemented in O(log n) time if you augment the structure to store some extra information. In your case, since you're using Java, you can just split the tree and discard the root of the tree you don't care about so that the garbage collector can handle it.

Details on how to implement this are given in this paper, starting on page 9. It's implemented in terms of a catenate (or join) operations which combines two trees, but I think the exposition is pretty clear.

Hope this helps!

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065