
I have a binary decision tree. It takes an array of floats as input, and each branch node splits on an input index and value, eventually leading to a leaf.

I'm performing a massive number of lookups on this tree (about 17% of execution time according to the profiler; Edit: having optimised other areas, it's now at almost 40%), and am wondering if I could/should be using a different data structure to improve lookup speed.

Some kind of hash table can't be used, as inputs do not map directly to a leaf node, but I was wondering if anyone had any suggestions as to methods and data structures I could use in place of the tree (or as well as it?) to improve lookup speed.

Memory is a concern, but less of a concern than speed.

Code is currently written in C#, but obviously any method could be applied.

Edit: There's a bit too much code to post, but I'll give more detail about the tree.

The tree is generated using information-gain calculations; it's not always a 50/50 split, and the split value could be any float value. A single input could also be split multiple times, increasing the resolution on that input.

I posted a question about performance of the iterator here:

Micro optimisations iterating through a tree in C#

But I think I might need to look at the data structure itself to improve performance further.

I'm aiming for as much performance as possible here. I'm working on a new method of machine learning, and the tree grows itself using a feedback loop. For the process I'm working on, I estimate it'll be running for several months, so a few % saving here and there is massive. The ultimate goal is speed without using too much memory.

Will Calderwood
  • dictionary with ordering which can be a map – Ryan Fung May 14 '13 at 09:29
    You say you have a binary tree and the input at each node is a float - is your choice of child node based on `input < 0.5` or is there something more complex going on? Can you post some code? Also: 17% of execution time isn't very contextual - that could be very fast! Do you have a target you're aiming for, or more detail on the profiling that you can share? – Dan Puzey May 14 '13 at 09:30
  • Thanks Dan, I've added some more detail about the tree and targets. – Will Calderwood May 14 '13 at 09:44
  • Does it have a fixed depth? – Will May 14 '13 at 10:00
  • And what precision is needed for the floats? Is it enough to have just 256 graduations between 0 and 1, for example? – Will May 14 '13 at 10:06
  • It's not a fixed depth, and I need a higher resolution than 8 bit unfortunately. – Will Calderwood May 14 '13 at 10:40
  • I've now optimised other areas of the code and over 30% of the time is spent in that one function... I really could do with finding a faster way! – Will Calderwood May 14 '13 at 14:41

2 Answers


If I understand correctly, you have floating-point ranges that have to be mapped to a decision. Something like this:

       x <= 0.0      : Decision A
 0.0 < x <= 0.5      : Decision B
 0.5 < x <= 0.6      : Decision C
 0.6 < x             : Decision D

A binary tree is a pretty good way to handle that. As long as the tree is well balanced and the input values are evenly distributed across the ranges, you can expect O(log2 n) comparisons, where n is the number of possible decisions.
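For reference, a lookup on such a tree is just a loop from the root down to a leaf. A minimal sketch of the structure described in the question (the field and method names are hypothetical, not taken from the asker's code):

```csharp
using System;

// Minimal sketch of a binary decision tree over an array of float inputs.
// Field and method names are illustrative.
class Node
{
    public int SplitIndex;       // which element of the input array to test
    public float SplitValue;     // threshold for the split
    public Node Left, Right;     // both null at a leaf
    public int Decision;         // meaningful only at a leaf

    public static int Lookup(Node root, float[] inputs)
    {
        Node n = root;
        while (n.Left != null)   // branch nodes always have two children
            n = inputs[n.SplitIndex] <= n.SplitValue ? n.Left : n.Right;
        return n.Decision;
    }
}
```

With a balanced tree this performs one comparison per level, i.e. O(log2 n) comparisons for n leaves.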

If the tree is not balanced, then you could be doing far more comparisons than necessary. In the worst case: O(n). So I would look at the trees and see how deep they are. If the same tree is used again and again, then the cost spent rebalancing once may be amortized over many lookups.

If the input values are not evenly distributed (and you know that ahead of time), then you might want to special-case the order of the comparisons so that the most common cases are detected early. You can do this by manipulating the tree or by adding special cases in the code before actually checking the tree.

If you've exhausted algorithmic improvements and you still need to optimize, you might look into a data structure with better locality than a general binary tree. For example, you could put the partition boundaries into a contiguous array and perform a binary search on it. (And, if the array isn't too long, you might even try a linear search on the array as it may be friendlier for the cache and the branch prediction.)
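For a single input, the boundary-array idea might look like this. It's a sketch, not the asker's code: it assumes splits of the form `x <= boundary`, and the names are illustrative.

```csharp
using System;

// Sketch: the split thresholds for one input are packed into a sorted array,
// and the matching decision is found with a binary search over that array
// instead of chasing node pointers.
static class RangeLookup
{
    // boundaries is sorted ascending; decisions has one more element.
    // Range i is (boundaries[i-1], boundaries[i]], matching "x <= b" splits.
    public static int Find(float[] boundaries, int[] decisions, float x)
    {
        int lo = 0, hi = boundaries.Length;      // answer index lies in [lo, hi]
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (x <= boundaries[mid]) hi = mid;  // x falls at or before this boundary
            else lo = mid + 1;
        }
        return decisions[lo];
    }
}
```

Using the four-range example above, `boundaries = {0.0, 0.5, 0.6}` and `decisions = {A, B, C, D}`.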

Lastly, I'd consider building a coarse index that gives you a head start into the tree (or array). For example, use a few of the most significant bits of the input value as an index and see if that can cut off the first few layers of the tree. This may help more than you might imagine, as the skipped comparisons probably have a low chance of getting correct branch predictions.
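A rough sketch of such a coarse index, under the added assumption (not stated in the question) that inputs lie in [0, 1): quantise the input into one of 256 buckets and precompute, per bucket, the narrow slice of the sorted boundary array that still needs searching.

```csharp
using System;

// Sketch of a coarse index over a sorted boundary array. Assumes inputs lie
// in [0, 1); all names are illustrative.
class CoarseIndexedLookup
{
    const int Buckets = 256;
    readonly float[] boundaries;   // sorted ascending
    readonly int[] decisions;      // one element longer than boundaries
    readonly int[] lo = new int[Buckets], hi = new int[Buckets];

    public CoarseIndexedLookup(float[] boundaries, int[] decisions)
    {
        this.boundaries = boundaries;
        this.decisions = decisions;
        for (int b = 0; b < Buckets; b++)
        {
            // Precompute the slice of boundaries a value in this bucket can hit.
            float min = (float)b / Buckets, max = (float)(b + 1) / Buckets;
            int l = 0;
            while (l < boundaries.Length && boundaries[l] < min) l++;
            int h = l;
            while (h < boundaries.Length && boundaries[h] <= max) h++;
            lo[b] = l; hi[b] = h;
        }
    }

    public int Find(float x)
    {
        int b = Math.Min((int)(x * Buckets), Buckets - 1);
        int l = lo[b], h = hi[b];              // search only this bucket's slice
        while (l < h)
        {
            int mid = (l + h) / 2;
            if (x <= boundaries[mid]) h = mid;
            else l = mid + 1;
        }
        return decisions[l];
    }
}
```

For most buckets the slice is empty and the lookup becomes a single table read with no comparisons at all.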

Adrian McCarthy
  • Thanks for the answer. My next plan is to put the tree into an array, and see what kind of improvements I can get from cache locality. I like the sounds of the indexing using the most significant bits. I'll have to think about the best way of implementing that. The problems with cramming the tree into an array are 1. It's growing and 2. The eventual size will be many many gigabytes. – Will Calderwood May 14 '13 at 17:23
  • @Will Calderwood: If the tree is on the order of gigabytes, then I doubt cache locality will buy you much. Making sure the tree is balanced is probably the biggest win. You might also look at doing lookups in parallel on a multi-core machine (assuming the tree is static). – Adrian McCarthy May 14 '13 at 18:05

Presuming decisions have a 50/50 chance:

Imagine that you had two binary decisions; the possible paths are 00, 01, 10 and 11.

Imagine that, instead of a tree, you had an array with four outcomes; you could turn your array of floats into a binary number which would be an index into this array.
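A sketch of this idea for a fixed-depth tree where the split at depth d always tests inputs[d] against a threshold known up front (the thresholds and names here are illustrative). As the comments point out, this only works when the comparison at each level doesn't depend on which path was taken.

```csharp
using System;

// Sketch: each comparison contributes one bit of an index into a flat
// outcome table, replacing pointer-chasing with bit shifts and one array read.
static class FlatTree
{
    public static int Lookup(float[] inputs, float[] thresholds, int[] outcomes)
    {
        int index = 0;
        for (int d = 0; d < thresholds.Length; d++)
            index = (index << 1) | (inputs[d] > thresholds[d] ? 1 : 0);
        return outcomes[index];   // outcomes.Length == 1 << thresholds.Length
    }
}
```

The trade-off is memory: the outcome table has 2^depth entries, so this only suits shallow (sub)trees.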

Will
  • Interesting thought. If I understand you correctly though, I'd still need to iterate through the tree to generate the binary number to get the index in the array. I don't see how I can generate the number without iterating the tree. – Will Calderwood May 14 '13 at 09:47
  • @WillCalderwood yes I was presuming a 50/50 chance meaning you didn't need to visit the node to know the split. You've now expanded the question. – Will May 14 '13 at 10:01