1

I want to asynchronously process a std::vector<int> within my program. The retrieval of the objects, however, do not utilize move semantics.

I've created a minimal working example and attached it below. According to cplusplus.com, move semantics on copying and constructing vectors are linear in the size of the moved-to object and constant in the size of the moved-from object, as long as they share the same allocator (I'm using the standard one). According to cplusplus.com again, when retrieving an object from std::future<T>::get() and T is neither void nor a reference type (which is not the case), it behaves like moving the value.

I even tried to just use inputs[i].get(); instead in the second loop, currently within the comments, and not assigning it to anything. This still give the linear time increase.

std::vector<std::vector<int>> GenerateTestCases(int input_size, int number_inputs) {
    std::vector<std::vector<int>> cases(number_inputs);

    for (auto i = 0; i < number_inputs; i++) {
        std::vector<int> some_vector(input_size);
        cases[i] = std::move(some_vector);
    }

    return std::move(cases);
}

int main() {
    for (auto i = 0; i < 25; i++) {
        auto size = (int)pow(2, i);
        int iterations = 100;

        auto test_cases = GenerateTestCases(size, iterations);

        std::vector<std::future<std::vector<int>>> inputs(iterations);

        const auto start = std::chrono::high_resolution_clock::now();

        for (auto i = 0; i < test_cases.size(); i++) {
            std::promise<std::vector<int>> prom;
            prom.set_value(std::move(test_cases[i]));

            inputs[i] = std::move(prom.get_future());
        }

        const auto middle = std::chrono::high_resolution_clock::now();

        for (auto i = 0; i < test_cases.size(); i++) {
            //inputs[i].get();
            auto& result = (inputs[i]);
            auto value = std::move(result.get());
        }

        const auto end = std::chrono::high_resolution_clock::now();

        const auto elapsed_first = std::chrono::duration_cast<std::chrono::nanoseconds>
        (middle - start).count();

        const auto elapsed_second = std::chrono::duration_cast<std::chrono::nanoseconds>
        (end - middle).count();

        std::cout << "First: " << elapsed_first << std::endl;
        std::cout << "Second: " << elapsed_second << std::endl;
        std::cout << std::endl;
    }

    char c;
    std::cin >> c;
}

I, however, see a linear increase in execution time when I'm executing my code above, ranging from

First: 13440ns Second: 9919ns

for the smallest arrays size to

First: 25919ns Second: 300147450ns

for the largest one.

I'm using VS2019 on Win10 and compiling for x64, if this is from interest.

Jarod42
  • 203,559
  • 14
  • 181
  • 302
SonneXo
  • 177
  • 1
  • 1
  • 10
  • `prom.get_future()` is already returning a pr value, btw. No need to move it. – AndyG Nov 06 '19 at 15:04
  • FWIW `return std::move(cases);` is almost always wrong. Function local objects are already moved in a return statement if they can be so all this does is stop the compiler from being able to use NRVO. Just use `return cases;` and "It'll do the right thing" – NathanOliver Nov 06 '19 at 15:06
  • @NathanOliver-ReinstateMonica Is it? I thought NRVO, as well as other optimizations in this category are all left to the compiler to apply (or not). Besides: What does it hurt? – SonneXo Nov 06 '19 at 15:09
  • I would cut `GenerateTestCases` down to `return std::vector>(number_inputs, std::vector(input_size));` – molbdnilo Nov 06 '19 at 15:10
  • 2
    @SonneXo Using `return std::move(cases);` stops the compiler from being able to use NRVO. Since NRVO is a good thing, you don't want stop it. If you have `return cases;` then you might get NRVO, if you don't you still get a move so it is a win-win to not use `return std::move(cases);` – NathanOliver Nov 06 '19 at 15:11
  • @molbdnilo That's just a minimal working example – SonneXo Nov 06 '19 at 15:12
  • @NathanOliver-ReinstateMonica I'll remember that for future usages, I thought the compiler would have enough freedom to to that nevertheless! – SonneXo Nov 06 '19 at 15:13
  • No worries, it does seem like it should be what you want. For the technical reason why it isn't, see: https://stackoverflow.com/questions/19267408/why-does-stdmove-prevent-rvo – NathanOliver Nov 06 '19 at 15:15
  • You loop through the `test_cases`, I don't see a reason the time wouldn't increase.. – apple apple Nov 06 '19 at 15:17
  • @appleapple Because *test_cases.size()* is always 100. Only the size of the inner vectors increases, which are not looped through. – SonneXo Nov 06 '19 at 15:19
  • oh yes, I just notice that – apple apple Nov 06 '19 at 15:20

1 Answers1

2

The time cost is probably from the destruction of the vector.


I did a test and the time doesn't increase when the results are kept in another vector (i.e. prevent the destruction)

http://coliru.stacked-crooked.com/a/dc7792496a981de3


While in your original code, the time does increase

http://coliru.stacked-crooked.com/a/66b96809fb35b3d2

Justin Bertram
  • 29,372
  • 4
  • 21
  • 43
apple apple
  • 10,292
  • 2
  • 16
  • 36
  • I think my English grammar here is weird, but I cannot get it correct now, can someone help me? I probably need to take a rest. :/ – apple apple Nov 06 '19 at 15:36
  • Wow, I would've never guessed deleting the memory would become more expensive the larger it is, let alone so drastically! Thank you! – SonneXo Nov 06 '19 at 15:37
  • @SonneXo it's also somewhat surprising to me. maybe it happens when run out of memory? – apple apple Nov 07 '19 at 06:09