3

I'm looking for a binary data structure (tree, list) that enables very fast searching. I'll only add/remove items at the beginning/end of the program, all at once, so it's going to be fixed-size and I don't really care about insertion/deletion speed. Basically, what I'm looking for is a structure that provides fast searching and doesn't use much memory.

Thanks

slartibartfast
  • 4,348
  • 5
  • 31
  • 46
  • What is the nature of your data? Could it be sorted? What is the size of it, and what are the memory constraints? – Haspemulator Aug 15 '11 at 15:17
  • Haspemulator, it's around five pointers, and I guess it could be sorted because each piece of data has a unique pointer. It's gonna have many nodes, on average probably around 50. – slartibartfast Aug 15 '11 at 15:32
  • @myrkos, so you need to search through 50 integers, or through 50 instances of structure with 5 pointers in it? – Haspemulator Aug 15 '11 at 15:40

6 Answers

6

Look up the unordered set in the Boost C++ library. Unlike red-black trees, which are O(log n) for searching, an unordered set is based on a hash and on average gives you O(1) search performance.

Michael Hays
  • 6,878
  • 2
  • 21
  • 17
4

One container not to be overlooked is a sorted std::vector.

It definitely wins on the memory consumption, especially if you can reserve() the correct amount up front.

Bo Persson
  • 90,663
  • 31
  • 146
  • 203
  • +1. Using `lower_bound` to find an element in a sorted vector essentially emulates the search behaviour of a `set`, but the `vector` is much more memory efficient, and thus the lookup will probably be much faster, too, because of memory locality. – Kerrek SB Aug 15 '11 at 16:05
  • Just make sure to do a binary search to find your data, and this is a good answer if you don't have to change the data at runtime, for a small (50ish) set. – Mooing Duck Aug 15 '11 at 16:05
2

So the key can be a simple type and the value is a smallish structure of five pointers.

With only 50 elements the data set starts getting small enough that the theoretical Big-O performance may be overshadowed, or at least measurably affected, by the fixed-time overhead of the algorithm or structure.

For example, an array or a vector with linear search is often the fastest option below about ten elements, because of its simple structure and tight memory layout.

I would wrap the container and run real data on it with timing. Start with STL's vector, go to the standard STL map, upgrade to unordered_map, and maybe even try Google's dense_hash_map or sparse_hash_map: http://google-sparsehash.googlecode.com/svn/trunk/doc/performance.html

MattiasF
  • 1,273
  • 1
  • 12
  • 15
0

The fastest tends to be a trie. I implemented one that was 3 to 15 times faster than std::unordered_map, though tries tend to use more RAM unless you store a large number of elements.

Exaeta
  • 300
  • 1
  • 4
0

One efficient (albeit a teeny bit confusing) data structure is the red-black tree.

Internally, the C++ standard library typically uses red-black trees to implement std::map – see this question

Community
  • 1
  • 1
Nate
  • 12,499
  • 5
  • 45
  • 60
0

The std::map and hash map are good choices. They also have constructors that ease one-time construction.

The hash map runs the key through a hash function that returns an array index. This may be slower than a std::map, but only profiling will tell.

My preference would be std::map, as it is usually implemented as a type of binary tree.

Thomas Matthews
  • 56,849
  • 17
  • 98
  • 154