C++ std::map creation taking too long?

Question

UPDATED:

I am working on a program whose performance is very critical. I have a vector of structs that are NOT sorted. I need to perform many search operations in this vector. So I decided to cache the vector data into a map like this:

        std::map<long, int> myMap;

        for (int i = 0; i < myVector.size(); ++i)
        {
            const Type& theType = myVector[i];
            myMap[theType.key] = i;
        }

When I search the map, the results of the rest of the program are much faster. However, the remaining bottleneck is the creation of the map itself (it is taking about 0.8 milliseconds on average to insert about 1,500 elements in it). I need to figure out a way to trim this time down. I am simply inserting a long as the key and an int as the value. I don't understand why it is taking this long.

Another idea I had was to create a copy of the vector (can't touch the original one) and somehow perform a faster sort than the std::sort (it takes way too long to sort it).

Edit:

Sorry everyone. I meant to say that I am creating a std::map where the key is a long and the value is an int. The long value is the struct's key value and the int is the index of the corresponding element in the vector.

Also, I did some more debugging and realized that the vector is not sorted at all. It's completely random. So doing something like a stable_sort isn't going to work out.

ANOTHER UPDATE:

Thanks everyone for the responses. I ended up creating a vector of pairs (std::vector of std::pair(long, int)). Then I sorted the vector by the long value. I created a custom comparator that only looked at the first part of the pair. Then I used lower_bound to search for the pair. Here's how I did it all:

  typedef std::pair<long,int> Key2VectorIndexPairT;
  typedef std::vector<Key2VectorIndexPairT> Key2VectorIndexPairVectorT;

  bool Key2VectorIndexPairComparator(const Key2VectorIndexPairT& pair1, const Key2VectorIndexPairT& pair2)
  {
      return pair1.first < pair2.first;
  }

  ...

  Key2VectorIndexPairVectorT sortedVector;
  sortedVector.reserve(originalVector.capacity());

  // Assume "original" vector contains unsorted elements.
  for (int i = 0; i < originalVector.size(); ++i)
  {
      const TheStruct& theStruct = originalVector[i];
      sortedVector.insert(Key2VectorIndexPairT(theStruct.key, i));
  }

  std::sort(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairComparator);

  ...

  const long keyToSearchFor = 20;

  const Key2VectorIndexPairVectorT::const_iterator cItorKey2VectorIndexPairVector = std::lower_bound(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairT(keyToSearchFor, 0 /* Provide dummy index value for search */), Key2VectorIndexPairComparator);

  if (cItorKey2VectorIndexPairVector->first == keyToSearchFor)
  {
      const int vectorIndex = cItorKey2VectorIndexPairVector->second;
      const TheStruct& theStruct = originalVector[vectorIndex];

      // Now do whatever you want...
  }
  else
  {
      // Could not find element...
  }

This yielded a modest performance gain for me. Before the total time for my calculations were 3.75 milliseconds and now it is down to 2.5 milliseconds.

You don't say what `Type` is. Maybe it has an expensive copy constructor? — rodrigo, Sep 19 '11 at 18:47
Sorry, "Type" is a struct that contains about 10 double values and a long value (the key). I don't even copy the Type into the map. Only the key and the corresponding index in the vector. — Andrew, Sep 19 '11 at 18:48
`faster sort than the std::find`. Did you mean faster than `std::sort`? — Mooing Duck, Sep 19 '11 at 18:49
The code isn't correct. std::map requires two template parameters, (the key and the value). — David Nehme, Sep 19 '11 at 18:53
Is `std::map` even legal? I thought you had to specify a minimum of two. — Flexo, Sep 19 '11 at 18:54
It took me a couple of minutes to even realize that the OP was declaring `std::map` when he really meant to declare `std::map`. — Kiril, Sep 19 '11 at 19:15
you can try to provide your version (or a better version) of the standard allocator. You may save time. Check loki library. However I doubt the bottleneck is the allocation on the heap. — Alessandro Teruzzi, Sep 19 '11 at 19:44
Are you saying that std::sort on a vector of pairs , which a comparator only comparing the "first" element of the pair) took longer than inserting them individually into a map? Or were you trying to sort the whole thing, with the 10 doubles? — David Nehme, Sep 19 '11 at 20:28
The updated code you posted using a sorted vector doesn't look like it would compile - you're calling insert() on the vector without an iterator. Is your real code calling push_back or are you inserting at the beginning (which would be quite slow)? — mattnewport, Sep 20 '11 at 17:12

mattnewport · Answer 1 · 2011-09-19T22:37:33.470

6

Both std::map and std::set are built on a binary tree and so adding items does dynamic memory allocation. If your map is largely static (i.e. initialized once at the start and then rarely or never has new items added or removed) you'd probably be better to use a sorted vector and a std::lower_bound to look up items using a binary search.

edited Sep 19 '11 at 22:37

answered Sep 19 '11 at 18:56

mattnewport

13,728
2
35
39

2

std::binary_search is the most misleading name in the standard library; it does a binary search all right, but only returns bool saying whether the item is in the vector or not. What you really want is **std::lower_bound**. – Mark Ransom Sep 19 '11 at 21:59
If you're creating a new vector of keys and indices for sorting i.e. `std::vector >` you'll want to call `reserve` on that new vector so there's only one memory allocation. – Mark Ransom Sep 19 '11 at 22:05
Ah yes, I forgot about the strange naming choices for binary_search. Edited to clarify. – mattnewport Sep 19 '11 at 22:38

bdonlan · Answer 2 · 2011-09-19T19:17:11.263

Maps take a lot of time for two reasons

You need to do a lot of memory allocation for your data storage
You need to perform O(n lg n) comparisons for the sort.

If you are just creating this as one batch, then throwing the whole map out, using a custom pool allocator may be a good idea here - eg, boost's pool_alloc. Custom allocators can also apply optimizations such as not actually deallocating any memory until the map's completely destroyed, etc.

Since your keys are integers, you may want to consider writing your own container based on a radix tree (on the bits of the key) as well. This may give you significantly improved performance, but since there is no STL implementation, you may need to write your own.

If you don't need to sort the data, use a hash table, such as std::unordered_map; these avoid the significant overhead needed for sorting data, and also can reduce the amount of memory allocation needed.

Finally, depending on the overall design of the program, it may be helpful to simply reuse the same map instead of recreating it over and over. Just delete and add keys as needed, rather than building a new vector, then building a new map. Again, this may not be possible in the context of your program, but if it is, it would definitely help you.

score 2 · Answer 3 · answered Sep 19 '11 at 18:58

I suspect it's the memory management and tree rebalancing that's costing you here.

Obviously profiling may be able to help you pinpoint the issue.

I would suggest as a general idea to just copy the long/int data you need into another vector and since you said it's almost sorted, use stable_sort on it to finish the ordering. Then use lower_bound to locate the items in the sorted vector.

score 1 · Answer 4 · answered Sep 19 '11 at 18:55

std::find is a linear scan(it has to be since it works on unsorted data). If you can sort(std::sort guaranties n log(n) behavior) the data then you can use std::binary_search to get log(n) searches. But as pointed out by others it may be copy time is the problem.

score 1 · Answer 5 · answered Sep 19 '11 at 18:55

1

If keys are solid and short, perhaps try std::hash_map instead. From MSDN's page on hash_map Class:

The main advantage of hashing over sorting is greater efficiency; a successful hashing performs insertions, deletions, and finds in constant average time as compared with a time proportional to the logarithm of the number of elements in the container for sorting techniques.

answered Sep 19 '11 at 18:55

Desmond Hume

8,037
14
65
112

I tried the stdext::hash_map data structure and I did not see any difference in performance. – Andrew Sep 19 '11 at 19:22

score 1 · Answer 6 · answered Sep 19 '11 at 19:00

Map creation can be a performance bottleneck (in the sense that it takes a measurable amount of time) if you're creating a large map and you're copying large chunks of data into it. You're also using the obvious (but suboptimal) way of inserting elements into a std::map - if you use something like:

myMap.insert(std::make_pair(theType.key, theType));

this should improve the insertion speed, but it will result in a slight change in behaviour if you encounter duplicate keys - using insert will result in values for duplicate keys being dropped, whereas using your method, the last element with the duplicate key will be inserted into the map.

I would also look into avoiding a making a copy of the data (for example by storing a pointer to it instead) if your profiling results determine that it's the copying of the element that is expensive. But for that you'll have to profile the code, IME guesstimates tend to be wrong...

Also, as a side note, you might want to look into storing the data in a std::set using custom comparator as your contains the key already. That however will not really result in a big speed up as constructing a set in this case is likely to be as expensive as inserting it into a map.

Kiril · Answer 7 · 2011-09-19T19:13:22.400

I'm not a C++ expert, but it seems that your problem stems from copying the Type instances, instead of a reference/pointer to the Type instances.

std::map<Type> myMap; // <-- this is wrong, since std::map requires two template parameters, not one

If you add elements to the map and they're not pointers, then I believe the copy constructor is invoked and that will certainly cause delays with a large data structure. Use the pointer instead:

std::map<KeyType, ObjectType*> myMap;

Furthermore, your example is a little confusing since you "insert" a value of type int in the map when you're expecting a value of type Type. I think you should assign the reference to the item, not the index.

myMap[theType.key] = &myVector[i];

Update:

The more I look at your example, the more confused I get. If you're using the std::map, then it should take two template types:

map<T1,T2> aMap;

So what are you REALLY mapping? map<Type, int> or something else?

It seems that you're using the Type.key member field as a key to the map (it's a valid idea), but unless key is of the same type as Type, then you can't use it as the key to the map. So is key an instance of Type??

Furthermore, you're mapping the current vector index to the key in the map, which indicates that you're just want the index to the vector so you can later access that index location fast. Is that what you want to do?

Update 2.0:

After reading your answer it seems that you're using std::map<long,int> and in that case there is no copying of the structure involved. Furthermore, you don't need to make a local reference to the object in the vector. If you just need to access the key, then access it by calling myVector[i].key.

For a map of containers, you also need to specify a comparison object in the template. — Mooing Duck, Sep 19 '11 at 18:53
@Mooing Duck, I think his example is all screwed up, because he uses the `Type.key` field as a key to the map, then he assigns an integer in the place of that key. Perhaps he means to keep map the key to an index in the vector where the object is being held, in that case he wouldn't need a comparison function. — Kiril, Sep 19 '11 at 18:58
Whoa, I typo'd. For associative containers who's keys are pointers, you need to specify a comparison object in the template. `std::map` Mark shows a demo comparison object. — Mooing Duck, Sep 19 '11 at 19:04
@Mooning Duck, I agree, but I don't even understand how he's trying to use it because the `std::map` takes to template parameters, not one. — Kiril, Sep 19 '11 at 19:08

andand · Answer 8 · 2011-09-19T19:28:46.783

0

Since your vector is already partially ordered, you may want to instead create an auxiliary array referencing (indices of) the elements in your original vector. Then you can sort the auxiliary array using Timsort which has good performance for partially sorted data (such as yours).

edited Sep 19 '11 at 19:28

answered Sep 19 '11 at 19:09

andand

17,134
11
53
79

score 0 · Answer 9 · edited May 23 '17 at 10:27

Your building a copy of the table from the broken example you give, and not just a reference.

Why Can't I store references in an STL map in C++?

Whatever you store in the map it relies on you not changing the vector. Try a lookup map only.

typedef vector<Type> Stuff;
Stuff myVector;
    typedef std::map<long, *Type> LookupMap;
    LookupMap myMap;
    LookupMap::iterator hint = myMap.begin();

    for (Stuff::iterator it = myVector.begin(); myVector.end() != it; ++it)
    {
        hint = myMap.insert(hint, std::make_pair(it->key, &*it));
    }

Or perhaps drop the vector and just store it in the map??

score 0 · Answer 10 · answered Sep 19 '11 at 20:06

I think you've got some other problem. Creating a vector of 1500 <long, int> pairs, and sorting it based on the longs should take considerably less than 0.8 milliseconds (at least assuming we're talking about a reasonably modern, desktop/server type processor).

To try to get an idea of what we should see here, I did a quick bit of test code:

#include <vector>
#include <algorithm>
#include <time.h>
#include <iostream>

int main() {

    const int size = 1500;
    const int reps = 100;

    std::vector<std::pair<long, int> > init;
    std::vector<std::pair<long, int> > data;
    long total = 0;

    // Generate "original" array
    for (int i=0; i<size; i++)
        init.push_back(std::make_pair(rand(), i));

    clock_t start = clock();
    for (int i=0; i<reps; i++) {
        // copy the original array
        std::vector<std::pair<long, int> > data(init.begin(), init.end());
        // sort the copy
        std::sort(data.begin(), data.end());

        // use data that depends on sort to prevent it being optimized away
        total += data[10].first;
        total += data[size-10].first;
    }
    clock_t stop = clock();

    std::cout << "Ignore: " << total << "\n";

    clock_t ticks = stop - start;
    double seconds = ticks / (double)CLOCKS_PER_SEC;
    double ms = seconds * 1000.0;
    double ms_p_iter = ms / reps;

    std::cout << ms_p_iter << " ms/iteration.";
    return 0;
}

Running this on my somewhat "trailing edge" (~5 year-old) machine, I'm getting times around 0.1 ms/iteration. I'd expect searching in this (using std::lower_bound or std::upper_bound) to be somewhat faster than searching in an std::map as well (since the data in the vector is allocated contiguously, we can expect better locality of reference, leading to better cache usage).

score 0 · Answer 11 · answered Sep 20 '11 at 00:46

Thanks everyone for the responses. I ended up creating a vector of pairs (std::vector of std::pair(long, int)). Then I sorted the vector by the long value. I created a custom comparator that only looked at the first part of the pair. Then I used lower_bound to search for the pair. Here's how I did it all:

      typedef std::pair<long,int> Key2VectorIndexPairT;
      typedef std::vector<Key2VectorIndexPairT> Key2VectorIndexPairVectorT;

      bool Key2VectorIndexPairComparator(const Key2VectorIndexPairT& pair1, const Key2VectorIndexPairT& pair2)
      {
          return pair1.first < pair2.first;
      }

      ...

      Key2VectorIndexPairVectorT sortedVector;
      sortedVector.reserve(originalVector.capacity());

      // Assume "original" vector contains unsorted elements.
      for (int i = 0; i < originalVector.size(); ++i)
      {
          const TheStruct& theStruct = originalVector[i];
          sortedVector.insert(Key2VectorIndexPairT(theStruct.key, i));
      }

      std::sort(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairComparator);

      ...

      const long keyToSearchFor = 20;

      const Key2VectorIndexPairVectorT::const_iterator cItorKey2VectorIndexPairVector = std::lower_bound(sortedVector.begin(), sortedVector.end(), Key2VectorIndexPairT(keyToSearchFor, 0 /* Provide dummy index value for search */), Key2VectorIndexPairComparator);

      if (cItorKey2VectorIndexPairVector->first == keyToSearchFor)
      {
          const int vectorIndex = cItorKey2VectorIndexPairVector->second;
          const TheStruct& theStruct = originalVector[vectorIndex];

          // Now do whatever you want...
      }
      else
      {
          // Could not find element...
      }

This yielded a modest performance gain for me. Before the total time for my calculations were 3.75 milliseconds and now it is down to 2.5 milliseconds.

C++ std::map creation taking too long?

11 Answers11

Update:

Update 2.0: