2

Suppose I have a vector (potentially large) of trivial types, e.g.

std::vector<int> v{1,2,3};

What is the best way to repeat it n times?

E.g. 3 times would give the result {1,2,3,1,2,3,1,2,3}

Surely, this is a problem that arises often (numeric libraries usually have such a function build in). My naive solution:

template<typename T>
std::vector<T> repeat(const std::vector<T> &input, unsigned int times) {
    std::vector<T> result;
    auto input_size = input.size();
    result.reserve(input_size * times);
    for (std::size_t rep = 0; rep < times; ++rep) {
        for (std::size_t i = 0; i < input_size; ++i) {
            result.push_back(input[i % input_size]);
        }
    }
    return result;
}

Surely this can be made faster? Maybe with std::copy? In that case, however, I'm not sure how to tell the vector its new size while avoiding zero-initialization. Apparently it cannot be done easily (see, e.g., 1). I tried to do it with iterators as well, but it didn't seem to be faster.

cigien
  • 57,834
  • 11
  • 73
  • 112
Adomas Baliuka
  • 1,384
  • 2
  • 14
  • 29
  • 1
    Faster than what? For a performance question you need to show the setup you're using, and the performance of your solution on some benchmarks. – cigien Nov 20 '20 at 17:13
  • Your edit with the profiling results is very nice. But you can't add that to the question. Go ahead and post it as an answer. – cigien Nov 23 '20 at 18:06
  • @cigien I saw the profiling you deleted as part of the question, which is not completely answered yet (there are some minor variations, some of them in the links I provided, that I haven't tried and nobody in this thread said anything about for comparing them to the ones discussed). Since you already deleted the section, I will post it as an answer. However, I'm not at all aware that this is the correct way and that "I can't add that to the question" according to site policy. – Adomas Baliuka Nov 23 '20 at 23:10
  • Yes, the question box is only for questions. You can edit it to clarify your question. But your profiling results on the various solutions are not part of the question. They are an answer, and should be written in the answer box. – cigien Nov 23 '20 at 23:18
  • Can you try `memcpy` instead of `push_back`? – binhgreat Nov 24 '20 at 08:12
  • Well, for one, `i % input_size` is equal to `i`, provided `i < input_size`. You didn't finish that thought in your code. Concerning your question "Maybe with std::copy? ", why don't you try it? – Ulrich Eckhardt Nov 28 '20 at 12:46

3 Answers3

3

I would just immediately size the vector to avoid any intermediate re-allocations. Then you can use std::copy with some arithmetic to copy the input vector into result using specific offsets into that pre-allocated vector.

template<typename T>
std::vector<T> repeat(const std::vector<T> &input, unsigned int times) {
    std::vector<T> result(input.size() * times);
    for (std::size_t rep = 0; rep < times; ++rep) {
        std::copy(input.begin(), input.end(), std::next(result.begin(), rep * input.size()));
    }
    return result;
}
Cory Kramer
  • 114,268
  • 16
  • 167
  • 218
  • I would take it a step further by calling `result.begin()` 1 time and saving the iterator to a local variable that is incremented on each loop iteration: `std::vector result(input.size() * times); auto iter = result.begin(); for (std::size_t rep = 0; rep < times; ++rep, iter += input.size()) { std::copy(input.begin(), input.end(), iter); }` – Remy Lebeau Nov 20 '20 at 18:36
  • That would be OK for `int` (or other trivially constructed types). Since you went through the troubles of making it a template, I think it would be better to declare an empty vector and use `reserve()` – Vlad Feinstein Nov 20 '20 at 20:40
  • @RemyLebeau I profiled your version and it doesn't seem to be any faster (see profiling code that I added to my question). – Adomas Baliuka Nov 23 '20 at 08:26
  • @VladFeinstein so you mean something like the code provided in my question? Note: For trivial types that version is significantly slower. – Adomas Baliuka Nov 23 '20 at 08:27
  • @AdomasBaliuka - I meant the code in *this* answer by Cory Kramer, only instead of `std::vector result(input.size() * times);` do `std::vector result; result.reserve(input.size() * times);` However, once again, there is no difference for `int` type – Vlad Feinstein Nov 23 '20 at 17:25
3

With range-v3 you could write:

#include <range/v3/all.hpp>
namespace rv = ranges::views;

template<typename T>
auto repeat(const std::vector<T> &input, unsigned int times) 
{
    return rv::repeat_n(input, times) 
           | rv::join 
           | ranges::to<std::vector<T>>;
}

Here's a demo.

I suspect this will have sufficiently good performance for your needs.

cigien
  • 57,834
  • 11
  • 73
  • 112
1

This was originally an edit to the question. The user cigien suggested to post it as an answer. This contains some incomplete (i.e., not yet exploring all possibilities to implement the solution) profiling results.

I made some profiling code (I'm by no means an expert on profiling) comparing my version and the one in the answer by Cory Kramer. The code in the answer appears to be 4.5 times faster than mine (tested at quick-bench.com with GCCv10.1, C++17, O3 optimization). The suggestion by Remy Lebeau to save a temporary of the iterator does not appear to make any difference.

There were duplicate questions of this that I missed before asking myself: 1 and 2. Some of the answers in these suggest even more slightly different solutions which I haven't profiled yet.

The range-v3 library (answer by cigien), while it looks very convenient, is not an option for me to use and I also didn't profile it.

Profiling code:

// This code is intended to be used at quick-bench.com.
// Needs profiling library AND ADDITIONAL INCLUDES to compile, 
// see https://github.com/google/benchmark
    
#include<vector>
    
template<typename T>
std::vector<T> repeat_1(const std::vector<T> &input, unsigned int times) {
    std::vector<T> result;
    auto input_size = input.size();
    result.reserve(input_size * times);
    for (std::size_t rep = 0; rep < times; ++rep) {
        for (std::size_t i = 0; i < input_size; ++i) {
            result.push_back(input[i % input_size]);
        }
    }
    return result;
}
    
template<typename T>
std::vector<T> repeat_2(const std::vector<T> &input, unsigned int times) {
    std::vector<T> result(input.size() * times);
    for (std::size_t rep = 0; rep < times; ++rep) {
        std::copy(input.begin(), input.end(),
                  std::next(result.begin(), rep * input.size()));
    }
    return result;
}
    
template<typename T>
std::vector<T> repeat_3(const std::vector<T> &input, unsigned int times) {
    std::vector<T> result(input.size() * times);
    auto iter = result.begin();
    for (std::size_t rep = 0; rep < times; ++rep, iter += input.size()) {
        std::copy(input.begin(), input.end(), iter);
    }
    return result;
}


static void version_1(benchmark::State &state) {
    std::vector<int> vec = {12, 4, 4, 5, 16, 6, 6, 17, 77, 8, 54};
    for (int i = 0; i < 100'000; ++i) {
        vec.push_back(i % 10'000);
    }

    for (auto _ : state) {
        auto repeated = repeat_1(vec, 1000);
        // Make sure the variable is not optimized away by compiler
        benchmark::DoNotOptimize(repeated);
    }
}
BENCHMARK(version_1);
    
static void version_2(benchmark::State &state) {
    std::vector<int> vec = {12, 4, 4, 5, 16, 6, 6, 17, 77, 8, 54};
    for (int i = 0; i < 100'000; ++i) {
        vec.push_back(i % 10'000);
    }

    for (auto _ : state) {
        auto repeated = repeat_2(vec, 1000);
        // Make sure the variable is not optimized away by compiler
        benchmark::DoNotOptimize(repeated);
    }
}
BENCHMARK(version_2);
    
static void version_3(benchmark::State &state) {
    std::vector<int> vec = {12, 4, 4, 5, 16, 6, 6, 17, 77, 8, 54};
    for (int i = 0; i < 100'000; ++i) {
        vec.push_back(i % 10'000);
    }

    for (auto _ : state) {
        auto repeated = repeat_3(vec, 1000);
        // Make sure the variable is not optimized away by compiler
        benchmark::DoNotOptimize(repeated);
    }
}
BENCHMARK(version_3);
cigien
  • 57,834
  • 11
  • 73
  • 112
Adomas Baliuka
  • 1,384
  • 2
  • 14
  • 29
  • This is nice, thanks for posting an answer. Small point, it would be nice to link to the *actual* quick-bench test you ran. When you run the benchmark, you'll get a unique URL that you can paste here. – cigien Nov 23 '20 at 23:21
  • @cigien Updated the URL. Assuming you have Google Benchmark, perhaps you would like to profile your version using the range-v3 library for comparison? – Adomas Baliuka Nov 24 '20 at 10:20
  • Thanks for adding the link. I've added the range-v3 version [here](https://quick-bench.com/q/AULNvhoyIg759xVF04PJ5rMyaC4) so you can change the link in your answer. Note that quick-bench appears to have an older version of range-v3, so the namespace is `view` instead of `views`, and it's `to_` instead of `to`. So make sure to add the source code of that particular function to your answer as well. – cigien Nov 24 '20 at 17:27
  • @cigien I included your code. When I get around to it, I will still profile `insert` and `copy` with `back_inserter`. Then hopefully I can accept the fastest answer here. – Adomas Baliuka Nov 28 '20 at 12:07
  • Oops, my benchmark has a typo: `version_4` is actually calling `repeat_3`. Also, my solution was wrong. I've fixed my solution and added a working demo, but I can't get it to work on quick-bench with the old version of range-v3 it's using. You can just "rollback" your answer to the last version, since that is still valid. Sorry about the goofup. – cigien Nov 28 '20 at 14:38
  • 1
    I've rolled-back myself, since the answer was incorrect with my buggy benchmark. – cigien Nov 28 '20 at 16:05