Union of map with adding up the corresponding values with the same key

Question

I have two std::map<char, int> A1, A2;, and I'd like to merge these two maps. The question is, how can I write the most efficient code to adding up the values to the same key in the two maps?

For example:

A1: ('a', 10), ('c', 8), ('z', 7)
A2: ('c', 12), ('q', 9), ('s', 2), ('u', 8)

A3 = merge(A1, A2); // A3 should be: ('a', 10), ('c', 20), ('q', 9), ('s', 2), ('u', 8), ('z', 7)

UPDATE: What I can think of is as follows:

std::map<char, int>::iterator it;
for (it = A2.begin(); it != A2.end(); it++)
{
    A1[ it->first ] += it->second;
}

Is there any better ways?

There is no such thing as most efficient code. There may be such thing as most efficient code for given input, architecture, amount of available memory, speed of installed memory, cpu cache size, cpu clock speed, and phase of the moon. — n. m. could be an AI, May 11 '14 at 13:18
@n.m.: OTOH, it's good to ask if there's a more efficient algorithm given a data structure, since there *is* obviously slower code and common pitfalls to avoid (stupid example: pushing elements in a vector on the front instead of the back). — Matteo Italia, May 11 '14 at 13:32
The code you have written looks good and is very readable. So I would suggest you go ahead and write that and let compiler worry for generating most efficient version. — Mohit Jain, May 11 '14 at 13:32
@n.m. I am intrigued by your lunar phase algorithmic efficiency improvements and would like to learn more! — Rook, May 11 '14 at 13:53
@Rook see [here](http://www.catb.org/jargon/html/P/phase-of-the-moon.html). Who knows, maybe your CPU firmware has something like that... — n. m. could be an AI, May 11 '14 at 15:12
@n.m. I am disappointed that there remain no examples even after all this time. — Rook, May 11 '14 at 15:15
Perhaps you could mark one of the answers as acceptable to close this question, or point out where they are deficient? — Rook, May 18 '14 at 10:20

score 0 · Answer 1 · answered May 11 '14 at 13:22

Try the following

#include <iostream>
#include <map>

int main() 
{
    std::map<char, int> m1 = { { 'a', 10 }, { 'c', 8 }, { 'z', 7 } },
                        m2 = { { 'c', 12 }, { 'q', 9 }, { 's', 2 }, { 'u', 9 } };


    for ( const auto &p : m2 ) m1[p.first] += p.second;

    for ( const auto &p : m1 ) std::cout << p.first << '\t' << p.second << std::endl;

    return 0;
}

score 0 · Answer 2 · edited May 23 '17 at 10:33

Well. Because you are using a std::map, the order of iteration through the collection is well defined (see Is the order of iterating through std::map known (and guaranteed by the standard)?)

This means you can do a fairly simple once-through iteration of your input maps, without having to do a single lookup in either of them. Advancing a normal forward iterator should be an O(1) step, I believe.

Insertion is interesting... according to cppreferece, you can supply a hint parameter to std::map::insert, which makes insertion

Amortized constant if the insertion happens in the position just before the hint, logarithmic in the size of the container otherwise.

Now, assuming I'm not being totally daft here, as we're inserting elements in order, we'll always be inserting at the very end of the map. Therefore, the hint can always be a3.end(). Prior to C++11, you could keep the return result of the last insert operation, which will be the element immediately before the end, and hence immediately before the next element to be inserted and use that as your hint instead.

Quick example code, may or may not work, E&OE, YMMV, etc!

std::map<char, int> a1, a2, a3;
// populate a1 and a2 yourself here. elided for brevity.

auto a1i = a1.begin();
auto a2i = a2.begin();

while(a1i != a1.end() && a2i != a2.end())
{
    if (a1i->first < a2i->first)
    {
            a3.insert(a3.end(), *a1i);
            a1i++;
    }
    else if (a1i->first == a2i->first)
    {
            a3.insert(
                a3.end(),
                std::make_pair(a1i->first, a1i->second + a2i->second));
            a1i++;
            a2i++;
    }
    else if (a1i->first > a2i->first)
    {
            a3.insert(a3.end(), *a2i);
            a2i++;
    }

    continue;
}

// push remaining elements, if any, into a3.
// this is O(NlogN), but you should use hinted insert as above to make it O(N)
a3.insert(a1i, a1.end());
a3.insert(a2i, a2.end());

Note: compared to your example code, this way avoids a tree lookup per iteration, and can be used with immutable maps as the input. Best case performance for my code should be of the order of O(N) (amortized, and assumes I've used insert hints correctly) , compared to the O(NlogN) of yours. The final insert code above was left in the less efficient form simply for brevity.

Now, there's also the possible issue of memory allocation. If either (or both) of your source maps are very large, it would be nice if you could pre-allocate a chunk of memory suitable to fit at least the larger of the two maps. std::map doesn't have a reserve method, however. If you were to use a suitable allocator that supported preallocation of a pool of memory for your new map instance, you might squeeze a tiny bit more performance out of the system.

Ultimately, unless you provide some metric for efficiency, there's a limit to how useful any answer can be ;-)

Union of map with adding up the corresponding values with the same key

2 Answers2