C++ async only uses 2 cores

Question

I am using std::async to run a method concurrently, but my CPU monitor shows that only 2 of 8 cores are in use. CPU utilization stays at about 13%-16% the whole time. std::async should create a new thread with every call and thus should be able to use more cores, or did I misunderstand something?

Here's my code:

for (map<string, Cell>::iterator a = cells.begin(); a != cells.end(); ++a)
{
    for (map<string, Cell>::iterator b = cells.begin(); b != cells.end(); ++b)
    {
        if (a->first == b->first)
            continue;

        if (_paths.count("path_" + b->first + "_" + a->first) > 0)
        {
            continue;
        }

        tmp = "path_" + a->first + "_" + b->first;
        auto future = async(launch::async, &Pathfinder::findPath, this, &a->second, &b->second, collisionZone);
        _paths[tmp] = future.get();
    }
}

Did I get the concept wrong?

EDIT:

Thanks guys, I figured it out now. I didn't know that calling .get() on the future would wait for it to finish, which in hindsight seems only logical...

However, I edited my code now:

for (map<string, Cell>::iterator a = cells.begin(); a != cells.end(); ++a)
{
    for (map<string, Cell>::iterator b = cells.begin(); b != cells.end(); ++b)
    {
        if (a->first == b->first)
            continue;

        if (_paths.count("path_" + b->first + "_" + a->first) > 0)
        {
            continue;
        }

        tmp = "path_" + a->first + "_" + b->first;
        mapBuffer[tmp] = async(launch::async, &Pathfinder::findPath, this, &a->second, &b->second, collisionZone);
    }
}

for (map<string, future<list<Point>>>::iterator i = mapBuffer.begin(); i != mapBuffer.end(); ++i)
{
    _paths[i->first] = i->second.get();
}

It works. Now it spawns threads properly and uses all my CPU power. You saved me a lot of trouble! Thanks again.

I'm pretty sure there are no guarantees (at the language level) as to what decisions the OS makes. — Shoe, Jun 05 '14 at 20:01
"Did I get the concept wrong?" [Yes, you did](http://stackoverflow.com/q/600795/335858). — Sergey Kalinichenko, Jun 05 '14 at 20:01
^and if you dont wanna "reap" it, you have to spawn the async threads yourself — SwiftMango, Jun 05 '14 at 20:04

score 1 · Answer 1 · edited May 23 '17 at 12:27

std::async runs the specified function asynchronously and returns immediately. That's it.

How it does so is up to the implementation (the standard library shipped with your compiler). Some implementations create a thread per async operation, some use a thread pool.
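To make the launch policy concrete, here is a minimal sketch (the helper names ranOnCallerThread and checkPolicies are made up for illustration). It checks which thread actually executes the task: with std::launch::deferred the task runs lazily on the thread that calls get(), while std::launch::async runs it on a different thread. If no policy is passed at all, the implementation is free to choose either.

```cpp
#include <cassert>
#include <future>
#include <thread>

// Hypothetical helper: returns true if the task ran on the thread that called get().
bool ranOnCallerThread(std::launch policy) {
    const std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get() == caller;  // get() runs a deferred task right here
}

// Example use: both checks hold on a conforming implementation.
void checkPolicies() {
    assert(ranOnCallerThread(std::launch::deferred));  // lazy, same thread
    assert(!ranOnCallerThread(std::launch::async));    // separate thread
}
```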

I recommend reading this: https://stackoverflow.com/a/15775870/2786682

By the way, your code does not really use std::async asynchronously, as you're making a synchronous call to future.get() right after 'spawning' the async operation.

score 1 · Accepted Answer · answered Jun 06 '14 at 09:43

To answer the underlying problem:

You should probably refactor the code by splitting the loop. In the first loop, you create all the futures and put them in a map indexed by tmp. In the second loop, you iterate over this map and call get() on each future, storing the results in _paths.

After the first loop, you'll have a lot of futures running in parallel, so your cores should be busy enough. If cells is big enough (more entries than numCores), it may be wise to split just the inner loop.
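The two-loop idea can be sketched in a self-contained form. This is only an illustration under assumptions: slowSquare is a hypothetical stand-in for Pathfinder::findPath, and computeAll and the "sq_" keys are invented names, not part of the original code.

```cpp
#include <cassert>
#include <future>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an expensive, independent computation.
int slowSquare(int x) {
    volatile long long sink = 0;  // volatile so the busy loop is not optimized away
    for (int i = 0; i < 2000000; ++i) sink = sink + i;
    return x * x;
}

std::map<std::string, int> computeAll(const std::vector<int>& inputs) {
    // Loop 1: launch every task. No get() here, so nothing blocks.
    std::map<std::string, std::future<int>> pending;
    for (int x : inputs) {
        pending["sq_" + std::to_string(x)] =
            std::async(std::launch::async, slowSquare, x);
    }

    // Loop 2: collect. Each get() waits only for its own task,
    // while the remaining tasks keep running in the background.
    std::map<std::string, int> results;
    for (auto& entry : pending) {
        results[entry.first] = entry.second.get();
    }
    assert(results.size() == pending.size());
    return results;
}
```

Calling computeAll({1, 2, 3}) returns a map with the keys "sq_1", "sq_2" and "sq_3". Note that this launches one task per input, so for very large inputs some form of batching may be preferable.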

Thanks, your answer gave me the final bit I needed for understanding the async function. I also want to thank the others. — user2443761, Jun 06 '14 at 14:26

score 1 · Answer 3 · answered Jun 06 '14 at 10:07

YES, you did get it wrong. Parallel code requires some thought before writing any code.

Your code creates a future (which may and probably will spawn a new thread), and immediately after that, you call its .get() method, which blocks the calling thread until the task has finished and returned its result.

So, with this strategy, your code will never utilize more than 2 CPU cores at any point in time. It can't.

Actually, most of the time your code utilizes only a single core!
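One way to check this claim is to count how many tasks are inside the worker function at the same time. The following is a rough sketch with invented names (task, peakWithImmediateGet, peakWithLateGet); the sleep merely simulates work. With an immediate get(), the peak is always 1, because each task has finished before the next one is launched. With get() postponed to a second loop, the peak can go up to n, although the exact value depends on the scheduler.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

std::atomic<int> running{0};  // tasks currently executing
std::atomic<int> peak{0};     // highest value of `running` observed

int task(int x) {
    int now = ++running;
    int seen = peak.load();
    // Raise `peak` to `now` unless another thread already stored a higher value.
    while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
    std::this_thread::sleep_for(std::chrono::milliseconds(20));  // simulated work
    --running;
    return x;
}

// Pattern from the question: launch, then block on get() right away.
int peakWithImmediateGet(int n) {
    assert(n > 0);
    running = 0;
    peak = 0;
    for (int i = 0; i < n; ++i) {
        auto f = std::async(std::launch::async, task, i);
        f.get();  // waits here, so the next task cannot start yet
    }
    return peak.load();
}

// Launch everything first, call get() afterwards.
int peakWithLateGet(int n) {
    assert(n > 0);
    running = 0;
    peak = 0;
    std::vector<std::future<int>> futures;
    for (int i = 0; i < n; ++i) {
        futures.push_back(std::async(std::launch::async, task, i));
    }
    for (auto& f : futures) f.get();
    return peak.load();
}
```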

The trick is "to parallelize" your code.
