1

I looked at this question: boost::flat_map and its performance compared to map and unordered_map. v.oddou's answer there shows iteration plots for std::map versus other containers, and from them I discovered that std::map is very slow at iterating over all of its elements. What puzzled me is that I assume the reason std::map is slow at iterating is because it presents all elements in key order. I believe that if std::map were constructed as an array, using the special index formula (I do not remember where it's referenced) for keeping track of the binary tree structure, then you could iterate quickly through the array sequentially, but the keys would be presented unordered. However, I have been unable to find a standard container that does this in C++. The only bottleneck for an array-like map is that dynamic resizing will be expensive; however, if you are content with doubling the array size each time you hit the size limit, you can achieve N*log(N) average insertion time over a long computation. I looked at the pseudocode that explains the construction of node pointers with tree association directions (parent, left child, right child), and I was unable to determine whether this node structure can be modified to add a one-dimensional unordered associativity.
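A minimal sketch of what I mean, assuming the index formula is the usual heap-style layout where node i has its children at indices 2*i+1 and 2*i+2 (the type names and keys are just for illustration):

```cpp
#include <cstddef>
#include <iostream>
#include <optional>
#include <vector>

// A binary search tree laid out in an array using the heap-style index
// formula: node i has children at 2*i+1 and 2*i+2. Empty slots are nullopt.
// Search follows indices (O(log N) for a balanced tree); iterating over all
// elements is a plain sequential scan, but keys come out in array order,
// not sorted order.
struct ArrayBST {
    std::vector<std::optional<int>> slots;

    void insert(int key) {
        std::size_t i = 0;
        while (true) {
            if (i >= slots.size()) slots.resize(2 * i + 1);  // ensure slot i exists (array roughly doubles)
            if (!slots[i]) { slots[i] = key; return; }
            i = key < *slots[i] ? 2 * i + 1 : 2 * i + 2;
        }
    }

    bool contains(int key) const {
        std::size_t i = 0;
        while (i < slots.size() && slots[i]) {
            if (*slots[i] == key) return true;
            i = key < *slots[i] ? 2 * i + 1 : 2 * i + 2;
        }
        return false;
    }
};

int main() {
    ArrayBST t;
    for (int k : {8, 3, 10, 1, 6}) t.insert(k);
    std::cout << t.contains(6) << '\n';   // prints 1
    for (const auto& s : t.slots)         // sequential, cache-friendly scan
        if (s) std::cout << *s << ' ';    // keys appear in array order, not sorted order
}
```

Note this sketch does no re-balancing, so it degenerates on sorted input; it is only meant to show the array layout and the sequential iteration.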

Is there a C++ standard library container of size N that has log(N) insertion and search, yet N iteration instead of N*log(N)?

Edit:

There appears to be confusion regarding insertion complexity, where some have suggested it's N and not N*log(N) for ordered maps. In my OP I did not make it clear, but I was thinking of inserting one element at a time, where you do not know what the next element will be. I believe that if you have all of the elements at once you could probably do it in N time. My source is map::insert.

linuxfreebird

3 Answers

2

I believe that if std::map were constructed as an array, using the special index formula (I do not remember where it's referenced) for keeping track of the binary tree structure, then you could iterate quickly through the array sequentially, but the keys would be presented unordered.

This would come at the price of invalidating iterators and pointers due to reallocation when the array needs to grow. Iterators and pointers may also be invalidated on any insertion due to re-balancing.

Iterator and pointer invalidation makes it impossible to use such a structure for std::map (the standard requires that map insertions do not invalidate iterators or references).

The only bottleneck for an array-like map is that dynamic resizing will be expensive

Also, re-balancing on insertion/deletion is somewhat more complex: you'll have to move element data around (probably while taking care of exception safety). Yet, such a container may be useful.


If you want linear iteration, you can always thread an intrusive doubly-linked list through the elements of your std::map. See boost::intrusive::list. Though this "linear" iteration will not be as fast as iterating over an array, since a contiguous array is more cache-friendly.
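A minimal sketch of that idea, assuming Boost is available; the Value type, hook name, and keys are made up for illustration:

```cpp
#include <boost/intrusive/list.hpp>
#include <iostream>
#include <map>

// Mapped values carry an intrusive hook so they can be threaded onto a
// separate list for iteration that does not traverse the tree.
struct Value {
    int data;
    boost::intrusive::list_member_hook<> hook;
    explicit Value(int d) : data(d) {}
};

using ValueList = boost::intrusive::list<
    Value,
    boost::intrusive::member_hook<Value,
                                  boost::intrusive::list_member_hook<>,
                                  &Value::hook>>;

int main() {
    std::map<int, Value> m;        // ordered lookup, O(log N)
    ValueList insertion_order;     // linear iteration without tree traversal

    for (int k : {5, 1, 3}) {
        auto [it, inserted] = m.emplace(k, Value{k * 10});
        if (inserted)
            insertion_order.push_back(it->second);  // node addresses are stable in std::map
    }

    for (const Value& v : insertion_order)
        std::cout << v.data << ' ';  // prints 50 10 30 (insertion order, not key order)

    insertion_order.clear();  // unlink hooks before the map destroys the values
}
```

The list here gives insertion order; keeping it in key order would require inserting at the position of the map's successor instead of at the back.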

Alex Guteniev
1

Is there a C++ standard library container of size N that has log(N) insertion and search, yet N iteration instead of N*log(N)?

Both the ordered and the unordered map / set satisfy these requirements. However, insertion into an unordered container is only constant in the average case, and linear in the worst case.


I believe that if std::map were constructed as an array, using the special index formula (I do not remember where it's referenced) for keeping track of the binary tree structure, then you could iterate quickly through the array sequentially, but the keys would be presented unordered. However, I have been unable to find a standard container that does this in C++.

That's because such a container is not in the standard library.

eerorika
  • "Also, such container would have linear complexity for insertion, because" . No that's not true. There are definitely data structures that can manage all those requirements. An AVL tree using arrays for backing storage for example. The stl has requirements that make that impossible to use though. – Voo Mar 23 '20 at 21:51
  • @Voo OK. I take your word for it. I removed the paragraph. – eerorika Mar 23 '20 at 22:13
1

You may want to use a std::vector, run std::sort on it, and then look up your keys with std::binary_search.

https://en.cppreference.com/w/cpp/algorithm/binary_search
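A minimal sketch of that pattern, assuming integer keys and the two-phase usage (insert everything, then query) described in the next paragraph; the data and names are made up for illustration:

```cpp
#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>

int main() {
    // Hypothetical key/value pairs; keys are what we sort and search on.
    std::vector<std::pair<int, const char*>> data = {
        {42, "answer"}, {7, "lucky"}, {13, "unlucky"}};

    // Phase 1: all insertions, then a single O(N log N) sort by key.
    std::sort(data.begin(), data.end(),
              [](const auto& a, const auto& b) { return a.first < b.first; });

    // Phase 2: O(log N) lookups. lower_bound also gives access to the value
    // (binary_search alone only answers yes/no).
    auto it = std::lower_bound(data.begin(), data.end(), 13,
                               [](const auto& p, int key) { return p.first < key; });
    if (it != data.end() && it->first == 13)
        std::cout << it->second << '\n';  // prints "unlucky"

    // Iterating over all elements is a plain linear, cache-friendly scan.
    for (const auto& [key, value] : data)
        std::cout << key << ' ';
}
```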

(If you only ever need access to the max element, you may also use std::make_heap or std::priority_queue.)

The constraint is that you'll need a clear time separation between all insertions and, only then, the queries. If you interleave queries with insertions, the complexity will exceed your requirements because of the need to re-sort.

(This is not true for the heap case; push_heap will preserve your requirements.)
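A small sketch of the heap variant, with made-up data, showing that push_heap keeps insertions and max-queries interleavable:

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> heap;
    // Insertions and max-queries can be interleaved: each push_heap is O(log N).
    for (int k : {4, 9, 2, 7}) {
        heap.push_back(k);
        std::push_heap(heap.begin(), heap.end());
        std::cout << "max so far: " << heap.front() << '\n';
    }
    // Full iteration is still a linear scan, just not in sorted order.
    for (int k : heap) std::cout << k << ' ';
}
```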


In case the above two possibilities are too constraining for your requirements, your last resort is to design an in-house allocator for map. It's not that hard, especially if you never delete elements; you'll be done in no time. Your allocator should work like the underlying allocation strategy of vector: just append at the end, and copy everything to newly reallocated space when full. With that technique you can create a custom iterator that doesn't follow the node pointers, but simply advances by sizeof(node).
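A minimal sketch of the allocator part, with one caveat: relocating live nodes on growth would break std::map's internal parent/child pointers, so the sketch below grows by adding new blocks instead of copying. The bump-at-the-end allocation and the no-op per-node deallocation follow the idea above; all names are hypothetical.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <memory>
#include <vector>

// Arena that hands out raw bytes from large blocks and never frees individual
// allocations. New blocks are added (not relocated) when full, so node
// addresses stay valid, which std::map requires.
struct Arena {
    std::vector<std::unique_ptr<std::byte[]>> blocks;
    std::size_t block_size = 4096;   // assumes each allocation is much smaller than this
    std::size_t used = 0;

    void* allocate(std::size_t n, std::size_t align) {
        if (blocks.empty() || used + n + align > block_size) {
            blocks.emplace_back(std::make_unique<std::byte[]>(block_size));
            used = 0;
        }
        used = (used + align - 1) / align * align;  // round the offset up to the alignment
        void* p = blocks.back().get() + used;
        used += n;
        return p;
    }
};

// Allocator facade that std::map can use; deallocate is a no-op and the
// memory is reclaimed only when the arena itself is destroyed.
template <class T>
struct ArenaAllocator {
    using value_type = T;
    Arena* arena;
    explicit ArenaAllocator(Arena* a) : arena(a) {}
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& o) : arena(o.arena) {}
    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) {}
    template <class U> bool operator==(const ArenaAllocator<U>& o) const { return arena == o.arena; }
    template <class U> bool operator!=(const ArenaAllocator<U>& o) const { return arena != o.arena; }
};

int main() {
    Arena arena;
    using Alloc = ArenaAllocator<std::pair<const int, int>>;
    std::map<int, int, std::less<int>, Alloc> m{Alloc{&arena}};
    for (int i = 0; i < 100; ++i) m[i] = i * i;
    std::cout << m[7] << '\n';  // prints 49
}
```

The custom contiguous iterator mentioned above is not shown here; it would scan the arena's blocks directly instead of following the tree's node pointers.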

v.oddou
  • Thanks. That is a good answer for my OP. Unfortunately, for my specific case I am dealing with insert and find in very short time intervals. I am dealing with something similar to sparse-to-sparse matrix multiplication, but the matrix multiplication is something different called operator multiplication, which does not exist in a standard library and I need to encode it by hand. I have encountered bottlenecks where memory allocation in std::map is happening too frequently and it's causing a slowdown. – linuxfreebird Mar 25 '20 at 04:35
  • Do you really need to have ordered elements? If not, I recommend using google::sparse_map or even my own open-address hash map: https://sourceforge.net/projects/cgenericopenaddresshashmap/ – v.oddou Mar 25 '20 at 06:16
  • Thanks, I will take a look at your suggestions. The only reason I need to have ordered elements is to print the elements in order, though that is not needed for the algorithm directly, and I need to find elements fast. I have a question: is there anything wrong with https://github.com/mpaland/avl_array ? I am scared to use hash maps, because my data is random and I am worried hash maps will run into many collisions. – linuxfreebird Mar 25 '20 at 12:48
  • the AVL thing should be good. go for it and see if you have problems. the only thing is this particular implementation mentions 'static memory'; I don't think it does the auto re-expansion you wanted. – v.oddou Mar 25 '20 at 15:38
  • Thanks. You're right about the auto re-expansion. I am currently exploring the idea of modifying the code to make it auto re-expand. I am very surprised that there appears to be no standard library container similar to avl_array. – linuxfreebird Mar 25 '20 at 15:59