23

I have a vector of integers:

std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};

Given that values.size() will always be even.

I simply want to convert the adjacent elements into a pair, like this:

std::vector<std::pair<int,int>> values = { {1,2}, {3,4} , {5,6}, {7,8} ,{9,10} };

I.e., the two adjacent elements are joined into a pair.

What STL algorithm can I use to easily achieve this? Is it possible to achieve this through some standard algorithms?

Of course, I can easily write an old school indexed for loop to achieve that. But I want to know what the simplest solution could look like using rangebased for loops or any other STL algorithm, like std::transform, etc.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
dev-here
  • 239
  • 2
  • 7
  • `std::ranges` should be able to help with this, but that requires C++20. – NathanOliver Apr 01 '22 at 12:46
  • 8
    What if the number of items in `values` is odd? – PaulMcKenzie Apr 01 '22 at 12:47
  • @PaulMcKenzie Good question. The no. of values will always be EVEN in my case. So this is just for even case. Let me addd this to the post – dev-here Apr 01 '22 at 12:51
  • @dev-here -- Maybe what you are looking for is something called a "stride iterator". Plenty of information on them can be found. The stride size in your case would be 2. – PaulMcKenzie Apr 01 '22 at 12:52
  • 7
    What if there is no suitable standard library algorithm? Do you just not want to do it any more in that case? – molbdnilo Apr 01 '22 at 12:52
  • @molbdnilo I can ofcourse solve it using any trivial algorithm. I thought that for a simple task like this there must be some way to use an existing one-liner STL algorithm. So I'm surprised if this is not the case. – dev-here Apr 01 '22 at 12:59
  • 1
    Maybe the closest you can get is to use (or abuse) [std::adjacent_find](https://en.cppreference.com/w/cpp/algorithm/adjacent_find), since that works on adjacent pairs, and just keep returning `false` from the predicate to process all pairs. In the predicate, you have a boolean "switch" to see if you want to save that pair to a vector. I could post that as an answer, but want to get feedback, due to the unorthodox way of using `std::adjacent_find`. – PaulMcKenzie Apr 01 '22 at 13:10
  • 2
    @PaulMcKenzie I have to admit that I've hacked `adjacent_find` like that in the past, and then had to leave a long comment explaining what I was really doing, and why I was massacring the STL. Afterwards I truly believe that implementing it yourself with a loop is the lesser evil. – AndyG Apr 01 '22 at 13:12
  • 1
    @AndyG -- Well, we both can afford the downvotes if one of us posts `std::adjacent_find` as an answer :-) – PaulMcKenzie Apr 01 '22 at 13:14
  • 1
    Does this answer your question? [c++ boost range. Array of pairs from array](https://stackoverflow.com/questions/53882570/c-boost-range-array-of-pairs-from-array) – Giovanni Cerretani Apr 01 '22 at 14:06
  • 3
    While this is indeed a duplicate of the older https://stackoverflow.com/questions/53882570 this question has more varied and modern answers than the other, so I'm just closing the older one as a duplicate of this one. – cigien Apr 01 '22 at 14:17
  • @cigien arguably the other question is different in that OP is specifically asking for a view, while here OP is asking for a materialization of that view – AndyG Apr 01 '22 at 14:48
  • @AndyG Hmm, the title says array, which is misleading. But you're right, it can be edited to focus on generating a view, without invalidating answers. I'll edit and reopen it. Thanks for the ping. – cigien Apr 01 '22 at 14:49
  • Perhaps you can express your question in terms of "a solution without an explicit loop (preferably with no more than the standard library)". Is that right? – Pablo H Apr 04 '22 at 15:01

5 Answers5

22

Once we have C++23's extension to <ranges>, you can get most of the way there with std::ranges::views::chunk, although that produces subranges, not pairs.

#include <iostream>
#include <ranges>
#include <vector>

int main()
{
    std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};
    auto chunk_to_pair = [](auto chunk)
    {
        return std::pair(*chunk.begin(), *std::next(chunk.begin()));
    };
    for (auto [first, second] : values | std::ranges::views::chunk(2) | std::ranges::views::transform(chunk_to_pair))
    {
        std::cout << first << second << std::endl;
    }
}

Alternatively, you could achieve a similar result by ziping a pair of strided views

#include <iostream>
#include <ranges>
#include <vector>

int main()
{
    std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};
    auto odds = values | std::ranges::views::drop(0) | std::ranges::views::stride(2);
    auto evens = values | std::ranges::views::drop(1) | std::ranges::views::stride(2);
    for (auto [first, second] : std::ranges::views::zip(odds, evens))
    {
        std::cout << first << second << std::endl;
    }
}

That last one can be generalised to n-tuples

template <size_t N>
struct tuple_chunk_t
{
    template <typename R, size_t... Is>
    auto impl(R && r, std::index_sequence<Is...>)
    {
        using namespace ranges::view;
        return zip(r | drop(Is) | stride(N)...);
    }
    
    template <typename R>
    auto operator()(R && r) const
    {
        return impl(std::forward<R>(r), std::make_index_sequence<N>{});
    }
    
    template <typename R>
    friend auto operator|(R && r, chunk_t)
    {
        return impl(std::forward<R>(r), std::make_index_sequence<N>{});
    }
};

template <size_t N>
constexpr tuple_chunk_t<N> tuple_chunk;
Caleth
  • 52,200
  • 2
  • 44
  • 75
  • 1
    this looks really cool. Is there a c++23 compiler online I can test this ? – dev-here Apr 01 '22 at 21:04
  • @dev-here not that I can find. `stride` is actually range-v3 and I don't know if there is a paper for it in 23 yet. I got distracted trying to generalise the `drop | stride` version with `std::index_sequence` – Caleth Apr 01 '22 at 21:35
  • 3
    using `` instead of ``, e.g. https://coliru.stacked-crooked.com/a/52b0078ead3ddef6 – Caleth Apr 01 '22 at 21:43
15

I'm not sure why you would require a standard algorithm when writing it yourself is roughly 5 lines of code (plus boilerplate):

template<class T>
std::vector<std::pair<T, T>> group_pairs(const std::vector<T>& values)
{
    assert(values.size() % 2 == 0);
    auto output = std::vector<std::pair<T, T>>();
    output.reserve(values.size()/2);
    for(size_t i = 0; i < values.size(); i+=2)
        output.emplace_back(values[i], values[i+1]);
    return output;
}

And call it like so:

std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};
auto result = group_pairs(values)

Live Demo

AndyG
  • 39,700
  • 8
  • 109
  • 143
  • Yeah, sorry, I can always write a for-loop like that with an index. But that is precisely what I am trying to avoid. – dev-here Apr 01 '22 at 13:07
  • 6
    @dev-here: Solutions using standard algorithms will require you to write custom iterators or custom callable classes and will be even more boilerplate. Sometimes it really is easier and more readable to write it yourself. – AndyG Apr 01 '22 at 13:09
  • 2
    @dev-here using a standard algorithm just because you want to use a standard algorithm won't get you far here. Often standard algorithms lead to cleaner code, but not always, and in such cases it is better to go for clean code rather than using a standard algorithm just because in other cases they help. – 463035818_is_not_an_ai Apr 01 '22 at 13:14
  • 3
    @dev-here a hand-written loop is maximum flexibility, using an algorithm is maximum expressiveness (if used right), a clean solution is often somewhere inbetween because expressiveness is lost when you try to squeeze something into an algorithm it wasnt made for (see other answers) – 463035818_is_not_an_ai Apr 01 '22 at 15:59
  • If you used `i < values.size() - 1` as the loop condition, correctness wouldn't depend on the `assert()` on `size%2`. Except then size=0 becomes the special case, because `size_t` is unsigned. Alternatively, you could loop `i < size/2 ; i++` and use `i*2` and `i*2 + 1` to loop zero times if the input size was 1 or 0. (Compilers should still optimize this out to a pointer increment with offsets.) – Peter Cordes Apr 02 '22 at 01:06
  • TL:DR: loop unrolling with arbitrary bounds is non-trivial, and it would be nice if there was a standard algorithm that took care of this for you, so I can see the appeal. (Like `std::adjacent_find` already has to make sure it only checks valid pairs. And I guess the C++23 chunk thing. But anything involving a manual `next` introduce a possible correctness problem that you have to think through.) – Peter Cordes Apr 02 '22 at 01:09
12

I am not aware of a standard algorithm that does what you want directly (though I am not very familiar with C++20 and beyond). You can always write a loop and most loops can be expressed via std::for_each which is a standard algorithm.


As you are accumulating elements in pairs, I would give std::accumulate a try:

#include <vector>
#include <numeric>
#include <iostream>

struct pair_accumulator {
    std::vector<std::pair<int,int>> result;
    int temp = 0;
    bool set = false;
    pair_accumulator& operator+(int x){
        if (set) {
            result.push_back({temp,x});
            set = false;
        } else {
            temp = x;
            set = true;
        }
        return *this;
    }
};

int main() {
    std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};
    auto x = std::accumulate(values.begin(),values.end(),pair_accumulator{}).result;
    for (const auto& e : x) {
        std::cout << e.first << " " << e.second << "\n";
    }
}

Whether this is simpler than writing a plain loop is questionable admittedly.


If possible I would try to not transform the vector. Instead of accessing result[i].first you can as well use values[i*2] and similar for second. If this is not feasible the next option is to populate a std::vector<std::pair<int,int>> from the start so you don't have to do the transformation. For the first, depending on what you need in details, the following might be a start:

#include <vector>
#include <iostream>

struct view_as_pairs {
    std::vector<int>& values;

    struct proxy {
        std::vector<int>::iterator it;
        int& first() { return *it;}
        int& second() { return *(it +1); }
    };
    proxy operator[](size_t index){
        return proxy{values.begin() + index*2};
    }
    size_t size() { return values.size() / 2;}

};

int main() {
    std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};
    view_as_pairs v{values};
    for (size_t i=0; i < v.size(); ++i){
        std::cout << v[i].first() << " " << v[i].second() << "\n";
    }
}

TL;DR: Consider if you can avoid the transformation. If you cannot avoid it, it is probably cleanest to write a loop. Standard algorithms help often but not always.

463035818_is_not_an_ai
  • 109,796
  • 11
  • 89
  • 185
  • Or construct a `std::pair` on the fly via a wrapper like this (like you're suggesting with avoiding making a copy of the whole vector); a `pair` is very cheap to construct, pure primitive values, so you can use that instead of inventing your own `struct proxy`. You can do the same `idx * 2 + 0 or 1`, and the `size/2` rounds down if there are an odd number of elements in the underlying array. So a caller will see a size 0 or 1 array as empty, no pairs. (Oh, but that would only give you read-only access; I guess the point of your design is to produce something that can be modified.) – Peter Cordes Apr 02 '22 at 01:20
11

OK, I hinted in the comments about using std::adjacent_find, so here is how you would do this.

And yes, many (even myself) considers this a hack, where we are using a tool meant for something else to make short work of solving a seemingly unrelated problem:

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>

int main()
{
   //Test data  
   std::vector<int> v = {1,2,3,4,5,6,7,8,9,10};

   // results 
   std::vector<std::pair<int,int>> result;

   // save flag 
   bool save_it = true;

   // Use std::adjacent_find 
   std::adjacent_find(v.begin(), v.end(), [&](int n1, int n2) 
      { if (save_it) result.push_back({n1,n2}); save_it = !save_it; return false; });
          
   for (auto& pr : result)
       std::cout << pr.first << " " << pr.second << "\n";
}

Output:

1 2
3 4
5 6
7 8
9 10

The way it works is we ignore the second, fourth, sixth, etc. pairs, and only save the first, third, fifth, etc. pairs. That's controlled by a boolean flag variable, save_it.

Note that since we want to process all pairs, the std::adjacent_find predicate always returns false. That's the hackish part of this solution.

PaulMcKenzie
  • 34,698
  • 4
  • 24
  • 45
  • 3
    Just make sure that if you use this code, put comments in it so that others aren't confused as to why `std::adjacent_find` is used in a non-obvious manner. We're not really finding anything, just ripping-off the pair-wise traversal of adjacent_find. – PaulMcKenzie Apr 01 '22 at 17:04
  • 1
    Note that it is implementation defined behavior in what order the predicate of `std::adjacent_find()` is called. The reason why it is not defined are probably the parallel versions. But of course I doubt that any sensible implementation would use any other but the obvious order for the serial versions. – Sedenion Apr 01 '22 at 18:17
  • @PaulMcKenzie I guess your answer and the answer below with std::accumulate() are similar in the way they exploit traversal? – dev-here Apr 01 '22 at 20:29
  • Is returning always 'false' from a predicate, is it misleading? or is it a common hack? – dev-here Apr 01 '22 at 20:35
  • 1
    @dev-here -- Basically, the `return false;` is like saying "don't terminate until the end". But if you read the docs for `std::adjacent_find`, the predicate suggests that the user of `std::adjacent_find` is actually attempting to find something, i.e. return `true` if found. We're not actually trying to find anything -- just traversing and using the pair of numbers given to the predicate (and with the knowledge that the pairs are adjacent). You could wrap `std::adjacent_find` into your own function that traverses pairs without halting -- call it `for_each_pair` or something like that. – PaulMcKenzie Apr 01 '22 at 20:51
  • 2
    @dev-here As to the hack being common, it's counter-intuitive to the mind, so programmers don't use it too often. Once explained, it becomes clear as to why adjacent_find was used, but it still runs counter to what programmers expect adjacent_find to be used for. But given all that, it works (given the caveats mentioned). – PaulMcKenzie Apr 02 '22 at 00:37
  • @PaulMcKenzie so not really production friendly. On the other hand, I must admit I really learned the inner workings of *find() methods better today, after seeing your example : ) – dev-here Apr 02 '22 at 01:10
  • @PaulMcKenzie the Complexity section of that page implies it must be a linear search from `first` to `last` – Caleth Apr 02 '22 at 14:15
3

The solutions so far try to use the std::vector iterators as input to the algorithms directly. How about defining a custom iterator that returns a std::pair and has strides of 2? Creating the vector of pairs is then a one-liner that uses std::copy. The iterator effectively provides a "view" onto the original vector in terms of pairs. This also allows the use of many of the standard algorithms. The following example could also be generalized quite a bit to work with most container iterators, i.e. you do the difficult work of defining such an iterator once and then you can apply it to all sorts of containers and algorithms. Live example: https://godbolt.org/z/ceEsvKhzd

#include <vector>
#include <algorithm>
#include <iostream>
#include <cassert>

struct pair_iterator {
    using difference_type = std::vector<int>::const_iterator::difference_type;
    using value_type = std::pair<int, int>;
    using pointer = value_type*;
    using reference = value_type; // Not a pair&, but that is ok for LegacyIterator
    // Can't be forward_iterator_tag because "reference" is not a pair&
    using iterator_category = std::input_iterator_tag;

    reference operator*()const { return {*base_iter, *(base_iter + 1)}; }
    pair_iterator & operator++() { base_iter += 2; return *this; }
    pair_iterator operator++(int) { auto ret = *this; ++(*this); return ret; }

    friend bool operator==(pair_iterator lhs, pair_iterator rhs){
        return lhs.base_iter == rhs.base_iter;
    }

    friend bool operator!=(pair_iterator lhs, pair_iterator rhs){
        return lhs.base_iter != rhs.base_iter;
    }

    std::vector<int>::const_iterator base_iter{};
};

auto pair_begin(std::vector<int> const & v){ assert(v.size()%2==0); return pair_iterator{v.begin()}; }
auto pair_end(std::vector<int> const & v){ assert(v.size()%2==0); return pair_iterator{v.end()}; }

int main()
{
    std::vector<int> values = {1,2,3,4,5,6,7,8,9,10};
    std::vector<std::pair<int, int>> pair_values;

    std::copy(pair_begin(values), pair_end(values), std::back_inserter(pair_values));

    for (auto const & pair : pair_values) {
        std::cout << "{" << pair.first << "," << pair.second << "} ";
    }
    std::cout << std::endl;
}
Sedenion
  • 5,421
  • 2
  • 14
  • 42