I need to design an interval search algorithm that works on 64-bit keys. A match occurs when a key k falls between interval endpoints k1 and k2. An important requirement is that lookup is faster than O(log n). Researching the available literature didn't turn up anything better than interval search trees. I wonder if it's feasible at all.
-
Impossible. You can't even perform exact search in O(1) (which your algorithm would obviously implement as a special case). You may be thinking of hash tables, but O(1) lookup there just means the hash function is O(1), not the overall amortized complexity of lookups (remember, they degrade to O(n) if the table isn't rehashed, and rehashing is itself O(n), etc.) – BadZen May 07 '15 at 19:42
-
No O(1). You may find this useful for picking the best data structure for your specific requirements: http://stackoverflow.com/questions/17466218/what-are-the-differences-between-segment-trees-interval-trees-binary-indexed-t – Alex May 07 '15 at 19:50
-
@OutputLogic - can you give a simple example of inputs and desired output(s)? – hatchet - done with SOverflow May 07 '15 at 19:51
-
Of course it can be done, if you stop thinking about it as a set of ranges and instead treat it like a gigantic set of booleans, for example with a trie. O(keysize) is constant time if you have 64-bit keys. (64 is more than the log of any reasonable number of ranges, though.) – harold May 07 '15 at 20:11
-
@hatchet: This is a well-known communication protocol. An input is a 64-bit address. An interval is a range of addresses, and each range is associated with a 32-bit command. An incoming packet contains an address; if the address falls into a range, the output is the command associated with that range. – OutputLogic May 07 '15 at 23:37
-
@harold That answer captures time requirements just as much as saying breaking AES-256 is constant time (which is also true, but practically irrelevant). – G. Bach May 08 '15 at 12:31
-
@G.Bach well unlike breaking AES-256, it could actually be done in practice too. It just wouldn't be particularly great. – harold May 08 '15 at 12:35
-
Assuming you could store a 2^64-length array, you could create it and look up an address directly. It looks like it's too large to initialize on my computer, so I would suggest chunking up the array; unfortunately, doing that brings you right back to O(log n). O(1) implies direct lookup, so complete initialization of the array is your only hope. – ZJS May 08 '15 at 13:27
2 Answers
If your keys have a distribution close to uniform, you can use interpolation search, which has O(log log n) expected time; this is much better than O(log n).
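A minimal sketch of interpolation search over a sorted list (the function name and the example data are illustrative, not from the original post); instead of halving the range as binary search does, it estimates the key's position by linear interpolation between the endpoints:

```python
def interpolation_search(a, key):
    """Return the index of key in sorted list a, or -1 if absent.

    Expected O(log log n) comparisons when keys are near-uniformly
    distributed; degrades toward O(n) on skewed distributions.
    """
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= key <= a[hi]:
        if a[hi] == a[lo]:
            pos = lo  # all remaining values are equal; avoid division by zero
        else:
            # Estimate the position by linear interpolation.
            pos = lo + (key - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[pos] == key:
            return pos
        if a[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1

keys = list(range(0, 1000, 7))      # uniformly spaced 64-bit-style keys
print(interpolation_search(keys, 70))   # index of 70
print(interpolation_search(keys, 71))   # not present
```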
UPD: Just an idea: if you have enough extra memory, you can build a trie-like structure with O(1) search time. The idea: build a tree of arrays[256], where each array is indexed by one byte of the key and the arrays are linked into a trie. The root of the trie is an array[256] indexed by the high byte of the key. This is not very practical, though, because in the bottom nodes you still have to find the interval borders with a linear search of up to ~64 iterations.

You can dispatch by leading bytes until the problem is small. That avoids most of the overhead of an interval tree, while maintaining the flexibility of one.
So you have a table of 256 structs, each pointing to another table of 256 structs, and so on down as far as needed, until you either hit a flag saying "no match" or reach a small interval tree for the exact matching condition. Processing the top of this structure with straightforward jumps rather than multiple comparisons, possible pipeline stalls, etc., may be a significant performance improvement for you.
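A sketch of the byte-dispatch idea under simplifying assumptions: one level of dispatch on the top byte of the address, with each bucket holding a small sorted list searched by bisection (standing in for the answer's small interval tree). The class and range data are hypothetical, not from the answer:

```python
import bisect

class IntervalTable:
    """Dispatch on the top byte of a 64-bit address, then search a small
    per-bucket sorted list of (lo, hi, command) ranges with bisect."""

    def __init__(self, ranges):
        # 256 buckets, one per top byte. A range spanning several top
        # bytes is duplicated into every bucket it touches; this assumes
        # ranges are non-overlapping.
        self.buckets = [[] for _ in range(256)]
        for lo, hi, cmd in ranges:
            for b in range(lo >> 56, (hi >> 56) + 1):
                self.buckets[b].append((lo, hi, cmd))
        for bucket in self.buckets:
            bucket.sort()

    def lookup(self, addr):
        bucket = self.buckets[addr >> 56]
        # Last range whose start is <= addr.
        i = bisect.bisect_right(bucket, (addr, float("inf"), None)) - 1
        if i >= 0:
            lo, hi, cmd = bucket[i]
            if lo <= addr <= hi:
                return cmd
        return None

# Hypothetical address ranges mapped to commands:
table = IntervalTable([
    (0x10, 0x20, 0xAAAA0001),
    (0x0100000000000000, 0x02FFFFFFFFFFFFFF, 0xBBBB0002),
])
print(hex(table.lookup(0x15)))                 # falls in the first range
print(table.lookup(0x30))                      # no match -> None
```

Additional dispatch levels (on the second byte, third byte, and so on) can be added in the same way when a single bucket grows too large.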
