Copying from one dimensional vector vector starts to first element of two dimensional vector pair vector>matrix

Question

I have multiple 3 one dimensional vectors (vector<int> starts, vector<int> ends, vector<int> points). Each having specific number of elements.

I want to create a two dimensional vector vector<pair<int,int>>matrix in such a sequence :

from beginning of matrix to size of start first element of matrix is elements of vector<int> starts and second element is "-1"
Append now the elements of vector<int> ends to matrix such that first element of matrix is elements of vector<int> ends and second element is "-2"
Append now the elements of vector<int> points to matrix such that first element of matrix is elements of vector<int> points and second element is Index of points.

Visual Representation :-

Input:

starts: {1, 2, 3} ends: {4, 5, 6} points: (7, 8, 9}

Output:

matrix: { {1, -1}, {2, -1}, {3, -1}, {4, -2}, {5, -2}, {6, -2}, {7, 0}, {8, 1}, {9, 2} }

Currently I am using a push_back with for-loop function which works perfectly fine but when the input size is big code is very slow.

Code I am using is as follows:

vector<pair<int,int>> fast_count_segments(
    vector<int> starts, 
    vector<int> ends, 
    vector<int> points) 
{
   int i = 0;
   vector<pair<int,int>>matrix;

   for(i; i<starts.size(); i++) {
       matrix.push_back(make_pair(starts[i],-1));
   }
   for(i; i<starts.size()+ends.size(); i++) {
       matrix.push_back(make_pair(ends[i-starts.size()],-2));
   }
   for(i; i<starts.size()+ends.size()+points.size(); i++) {
        matrix.push_back(make_pair(
            points[i-starts.size()-ends.size()],
            i-(starts.size()+ends.size())
        ));
   }
   return matrix;
}

Can you please help on how to fill the 2D vector quickly with these requirements without iterating through each element. I am using C++11. Thanks in Advance !!

Don't tag a C++ question with the C tag. It won't help you make friends and influence people. — Jonathan Leffler, Mar 27 '20 at 20:43
`vector>` … uh, that's not how it works. That would create a 1D vector of pairs of int. Also https://www.boost.org/doc/libs/1_72_0/libs/multi_array/doc/user.html — datenwolf, Mar 27 '20 at 20:53
@datenwolf: That could be a sparse representation of a matrix. — einpoklum, Mar 27 '20 at 20:54
@epinpoklum Only 2-by-N matrices or N-by-2 matrices. Maybe you want `vector>`? — Zuodian Hu, Mar 27 '20 at 20:59
@einpoklum: Only if the elements of the sparse matrix you want to represent are members of the set `{False | True}`. Or if you want to represent a bidiagonal matrix. Either way, it'd be a very inefficient way to go about this. — datenwolf, Mar 27 '20 at 21:04
@datenwolf: Yes, that would be a boolean matrix. Ok, point taken, I now also share your worry OP might be doing something wrong here. — einpoklum, Mar 27 '20 at 21:06

einpoklum · Answer 1 · 2020-03-27T21:55:50.043

Preliminary concern: As @datenwolf and others note - Your resulting data structure is not a 2D matrix (unless you mean a boolean matrix in sparse representation). Are you sure that's what you want to be populating?

Regardless, here are a few ideas to possibly improve speed:

Don't take the input vectors by value! That's useless copying... take their .data(), or their .cbegin() iterator, or take a span<int> parameter.
Use the reserve() method on the target vector to avoid multiple re-allocations.
Use .emplace_back() instead of .push_back() to construct the points in place, rather than constructing-then-moving every point. Although, to be honest, the compiler will probably optimize those constructions away, anyway.
Put the .size() values of the input vectors in local variables. This will only help if, for some reason, the compiler suspects that size will not be constant throughout the execution of the function.
Make sure you're passing optimization switches to the compiler (e.g. -O2 or -O3 to GCC and clang). This might seem obvious to you but sometimes it's so obvious you forget to check it's actually been done.

Some aesthetic comments:

No need to use the same counter for all vectors. for(int i = 0; i < whatever; i++) can be used multiple times.
No need for raw for loops, you can use for(const auto& my_element : my_vector) for the first two loops. The third loop is trickier, since you want the index. You can use std::difference() working with iterators, or go with Python-style enumeration described here.
You might consider using std::transform() with a back_emplacer output iterators instead of all three loops. No-loop code! That would mean using std::difference() in the transformer lambda instead of the third loop.

For 1. wouldn't a reference be the most natural option? 3. counts as aethetic, with integer values the generated code will be the same. — Marc Glisse, Mar 27 '20 at 21:19
@MarcGlisse: 1. Nope, we work with iterators, ranges, or in some cases spans. Unless we're operating on the input vector _as_ a vector, which we aren.t 3. You're probably right. — einpoklum, Mar 27 '20 at 21:55
A vector is a special case of range, so you can take a T const& (or sized_range const& in C++20) for instance. — Marc Glisse, Mar 28 '20 at 07:26

cigien · Answer 2 · 2020-03-28T20:35:48.743

This incorporates the suggestions from @einpoklum's answer, but also cleans up the code.

std::vector<std::pair<int,int>> fast_count_segments(
    std::vector<int> const & starts, 
    std::vector<int> const & ends, 
    std::vector<int> const & points) 
{
   std::vector<std::pair<int,int>> matrix(starts.size() + ends.size() + points.size());

   auto out = std::transform(starts.cbegin(), starts.cend(),
                             matrix.begin(), 
                             [](int i) { return std::pair<int,int>{i, -1}; });

   out = std::transform(ends.cbegin(), ends.cend(),
                        out, 
                        [](int i) { return std::pair<int,int>{i, -2}; });

   int c = 0;
   std::transform(points.cbegin(), points.cend(),
                  out,
                  [&c](int i) { return std::pair<int,int>{i, c++}; });

   return matrix;
}

You could even write all the transforms as a single expression. Whether this is easier to read is highly subjective, so I'm not recommending it per se. (Try reading it like you would nested function calls.)


std::vector<std::pair<int,int>> fast_count_segments(
    std::vector<int> const & starts, 
    std::vector<int> const & ends, 
    std::vector<int> const & points) 
{
   std::vector<std::pair<int,int>> matrix(starts.size() + ends.size() + points.size());
   int c = 0;

   std::transform(points.cbegin(), points.cend(),
        std::transform(ends.cbegin(), ends.cend(),
            std::transform(starts.cbegin(), starts.cend(),
                matrix.begin(), 
            [](int i) { return std::pair<int,int>{i, -1}; }),
        [](int i) { return std::pair<int,int>{i, -2}; }),
    [&c](int i) { return std::pair<int,int>{i, c++}; });

   return matrix;
}

Doesn't each `transform` return the parameter for the next one? — Marc Glisse, Mar 27 '20 at 21:14
Yeah, thanks for the suggestion. I realize I could nest the transforms as well, though I suspect that wouldn't make it easier to read. — cigien, Mar 27 '20 at 21:21
Thank you @cigien. I am beginner in cpp. don't understand the significance of [] in transform. Also it is giving compilation error "returning initializer list". Also I am using C++11 hence for [c = 0], lambda capture initializers only available with -std+c++14 — ibrahim rupawala, Mar 28 '20 at 19:37
The [] indicates the start of a [lambda expression](https://en.cppreference.com/w/cpp/language/lambda). Updated the answer to work in c++11. — cigien, Mar 28 '20 at 19:40
Thanks a ton @cigien. This really makes it clear. However if I use this method I am not able to read the matrix as I was reading it when I populate the matrix using the for-loop. `cout << matrix[i].first << " " << matrix[i].second << endl;`. Where i iterates through the size of the loop. What might be the reason ? — ibrahim rupawala, Mar 28 '20 at 20:18
That's right. [reserve](https://en.cppreference.com/w/cpp/container/vector/reserve) only changes the capacity, not the size of the vector. I forgot that. I've edited the answer. — cigien, Mar 28 '20 at 20:37

Copying from one dimensional vector vector starts to first element of two dimensional vector pair vector>matrix

2 Answers2