0

I am trying to come up with techniques of accessing/retrieving an object from a container (map, vector, ) in the most efficient manor possible.

So if I have the object:

class Person
{
   public:
       string name;
       unsigned int ID; // unique ID
       double deposit;
 };

// And then I have a vector of pointers to person objects
std::vector <Person*> people;

Person* getPerson( string nName );
Person* getPerson( unsigned int nID ); // what would be a good method to quickly identify/retrieve the correct Person object from my vector?

My ideas:

This is the iterative solution that is not efficient:

Person* getPerson( string nName )
{
   for (int i=0; i<people.size(); i++)
   {
      if (people[i]->name == nName ) { return people[i]; }
   }
}

Another way: have 2 maps

map <string, Person*> personNameMap;  

Person* getPerson( string nName )
{
   return personNameMap[nName]; 
}

map <string, Person*> personIDMap;
Person* getPerson( unsigned int nID )
{
   char id[2];
   atoi( nID, id, 10 );  // or is it itoa?
   return personNameMap[id]; 
}

Any other ideas how I could store & retrieve my objects from a collection in a fast & efficient manor?

Erik
  • 88,732
  • 13
  • 198
  • 189
user593747
  • 1,484
  • 4
  • 32
  • 52

3 Answers3

1

std::map stores its element in a balanced tree structure and provides quite good look up speed. But inserting in std::map is slower then in sequence containers for the same reasons. So map is your choice if yoh have a lot off look ups and quite small amount of insertions.

Besides that I don't understand exactle why you made map <string, Person*> personIDMap; instead of map <unsigned int, Person*> personIdMap.

beduin
  • 7,943
  • 3
  • 27
  • 24
  • Thanks for the reply. Yeah I am new to maps, I am thinking of them as the Python Dictionary so I have to get OUT of my head that only string can be used as keys. Any suggestions of what techniques you would use for fast lookup/retrieval of pointer objects? Or would you do the 2 ordered maps (by name(string as the key)) & (by ID(unsigned int as the key))? – user593747 May 05 '11 at 11:16
  • Depend on your purposes you may use two `map` or solution, proposed by Mayank. If you use `by name` lookup more often, then `two maps` seems better, if frequency of using `by name` and `by id` look up is equal, then `map + vector` solution is better. In both cases you have to provide consistency (between two maps or between map and vector). – beduin May 05 '11 at 11:38
1

std::map is a balanced tree that is O(log n) steps for searching. Boost offers boost::unordered_map which is a hash-map. It is asymptotically worse (O(n^2)), however, on average it performs better. Depending on the fullness of the container, it is 1-3 constant steps. Once the container gets filled (which means that the values of the keys get exhausted) there will be more and more collisions and the performance will degrade quickly. In most implementations this happens at around 80% fullness. This is not a problem in most cases, but be aware of this limitation.

Tamás Szelei
  • 23,169
  • 18
  • 105
  • 180
  • Thanks for the reply. When you say full, & say for example we are talking about a map that uses strings for the keys, how big would the map be when performance starts to go down. I know you said 80% but in terms of size, would that be 10000 if the string keys had to be <= 3 chars in length? – user593747 May 05 '11 at 11:21
  • Most compilers do provide hash map, either under `std::tr1::unordered_map` or under old SGI name `std::hash_map` (though VisualC++ differs in signature a little from the SGI version). Compilers supporting C++0x provide it as `std::unordered_map`. This is usually the same as the boost version. – Jan Hudec May 05 '11 at 11:27
  • It's not a problem with `map` because that a balanced tree (red-black tree, I think, but I'm not sure). I don't think that 10000 would be a problem with unordered_map. (10000/(256^3) is waaay smaller than 80%). – Tamás Szelei May 05 '11 at 11:31
  • @Jan: Thanks for the clarification, I was not aware of the implementation details. – Tamás Szelei May 05 '11 at 11:33
  • ás, @user593747: It will *never* be problem. A collision happens when the *hash values* for two distinct keys are the same. The hash values are `size_t` (modulo the container size, but that will grow as needed) so you are guaranteed to run out of memory before you approach exhaustion of the hash value space. The fact that your actual key space is smaller than that does not cause any extra conflicts. – Jan Hudec May 05 '11 at 11:37
0

Map is the fastest container for look up in C++ if index is not integers I hope that is good

Mayank
  • 5,454
  • 9
  • 37
  • 60
  • Ok, but I want to be able to look up Person's by their name & their ID(number) so what would you do - create 2 containers, a map when looking up by name, & a vector/array when looking up by ID. Or would you use a templated function to do this? I am trying to learn techniques for organising & retrieving objects from collections. – user593747 May 05 '11 at 11:09
  • Keep two containers. One map nameToIdMap. And a vector. But, the id and indexes should match. When you want look up with it just return objVec[index]`(Where index should be id)`. In case of name search for index from the map using name as key, and return objVec[index]. But for these to work efficiently, you need to know the size of vector so that it can be initialized before filling up. Then fill the vector like objVec[id] = "Person" and avoid ppush_back. – Mayank May 05 '11 at 11:17