Iterating a vector to second to last element with index vs iterator

Question

When iterating from the beginning of a C++11 std::vector to the second to last element, what would be the preferred style?

std::vector<const char*> argv;
std::string str;

Should this kind of more C++-esque method be used

for (const auto& s: decltype(argv)(argv.begin(), argv.end()-1)) {
    str += std::string(s) + ' ';
}

or should the more traditional way be preferred?

for (size_t i = 0; i < argv.size() - 1; ++i) {
    str += std::string(argv[i]);
}

Neither unless you check that `argv.end()-1` and/or `argv.size() - 1` is valid. — Richard Critten, Jul 06 '18 at 20:28
@RichardCritten Good point, I should have mentioned that `argv.size() >= 2` is guaranteed. — chew socks, Jul 06 '18 at 20:34
@bipll Oh...This is because `size()` is unsigned and `0ULL - 1` is very large? — chew socks, Jul 06 '18 at 20:35
@OlivierSohn I thought this question would help me improve my C++ in general (and it has), but specifically for this case the last element of `argv` is `nulltpr` because this happens `execvp(argv[0], (char* const*)&argv[0])`, — chew socks, Jul 06 '18 at 21:50

Barry · Answer 1 · 2018-07-07T11:26:53.553

14

Please don't write this:

for (const auto& s: decltype(argv)(argv.begin(), argv.end()-1)) {

First and foremost, nobody (including you) will understand this when you look back on it. Secondly, since decltype(argv) is a vector, this is copying a whole bunch of elements ... all simply because you want to avoid iterating one of them? That's very wasteful.

It also has another problem, which is shared by your second option.

This:

for (size_t i = 0; i < argv.size() - 1; ++i) {

is much more subtly problematic, because size() is unsigned. So if argv happens to be empty, argv.size() - 1 is going to be some enormously large number, and you're actually going to access all these invalid elements of the array leading to undefined behavior.

For iterators, if argv.begin() == argv.end(), then you can't get the previous iterator from end() because there is no previous iterator from end(). All of end() - 1, prev(end()), and --end() are undefined behavior already. At that point, we can't even reason about what the loop will do because we don't even have a valid range.

What I would suggest instead is:

template <typename It>
struct iterator_pair {
    It b, e;

    It begin() const { return b; }
    It end() const { return e; }
};

// this doesn't work for types like raw arrays, I leave that as
// an exercise to the reader
template <typename Range>
auto drop_last(Range& r) 
    -> iterator_pair<decltype(r.begin())>
{
    return {r.begin(), r.begin() == r.end() ? r.end() : std::prev(r.end())};
}

Which lets you do:

for (const auto& s : drop_last(argv)) { ... }

This is efficient (avoids extra copies), avoids undefined behaviors (drop_last() always gives a valid range), and it's pretty clear what it does from the name.

edited Jul 07 '18 at 11:26

answered Jul 06 '18 at 20:57

Barry

286,269
29
621
977

"as an exercise to the reader"...are you a professor by any chance? =P – chew socks Jul 06 '18 at 21:03
Also, thank you. This is really instructive/helpful. I've spent a lot of time with Python lately and people always talk about doing things "the pythonic way"; trying to following that ideology (but for C++) is how I came up with the `decltype` copying stuff. – chew socks Jul 06 '18 at 21:06
1

Note that the corresponding function in Haskell is called `init`, although in C++ this name was often used for quite an antipatternial code design. – bipll Jul 06 '18 at 21:24
1

@chewsocks - I think one general rule of thumb for `decltype` is that you should avoid it in general and should most especially avoid it outside of a template. It took me quite awhile to understand that piece of code myself. `decltype` should never be used in a context where it could be easily interpreted as a function call. – Omnifarious Jul 06 '18 at 21:42
@Omnifarious Do you have a specific article that describes this in more detail than would fit in a comment here? I'd like to read more about this, I've actually been trying to use `decltype` *more*. – chew socks Jul 06 '18 at 21:45
@chewsocks - not offhand. I'll see if I can find some. Both auto and decltype are tricky. It's really easy to overuse auto. If the type of the variable isn't obvious from context it basically means you have to read a bunch of code that's far from where auto appears to figure out what's going on. And that's always a situation to avoid. – Omnifarious Jul 06 '18 at 21:51
@chewsocks - Here is a good SO question about `auto` (which has some related concerns to `decltype`). https://stackoverflow.com/questions/6434971/how-much-is-too-much-with-c11-auto-keyword and this answer on Software Engineering: https://softwareengineering.stackexchange.com/questions/180216/does-auto-make-c-code-harder-to-understand – Omnifarious Jul 06 '18 at 23:44
1

@chewsocks - Also, this GotW is good: https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/ - I must say that a lot of people sort of disagree with me. I still stand by the 'never use `decltype` in a context where it would be easily confused with a function call'. In cases like that, I would recommend a `using` or `typedef` declaration along with `decltype` so you have a local name for the type and use that instead. Also, prefer the `{ ... }` initializer syntax because it's also harder to confuse with a function call. – Omnifarious Jul 07 '18 at 00:00

lubgr · Answer 2 · 2018-07-07T05:13:32.810

6

I find the first option a bit clumsy to read, but as this is rather a matter of personal preference, I propose an alternative approach avoiding a hand-written loop (and under the assumption that argv.size() >= 1), which might be better in the sense that it reduces the likelyhood of typos and indexing bugs.

#include <numeric>

std::string str;

if (!argv.empty())
    str = std::accumulate(argv.begin(), std::prev(argv.end()), str);

edited Jul 07 '18 at 05:13

answered Jul 06 '18 at 20:36

lubgr

37,368
3
66
117

2

Nice use of `accumulate`. – NathanOliver Jul 06 '18 at 20:37
2

`--argv.end()` isn't guaranteed to be valid. `argv.end() - 1` on the other hand, is. – Barry Jul 06 '18 at 20:47
I like this, thanks! Is there a reason to prefer `--argv.end()` over `std::prev(argv.end())` like R Sahu mentioned? – chew socks Jul 06 '18 at 20:47
@Barry can you give me a hint on why decrementing it isn't guaranteed to be valid? – lubgr Jul 06 '18 at 20:48
@chewsocks No, but appearently I also wasn't aware of the potential pitfalls of `--argv.end()` – lubgr Jul 06 '18 at 20:49
@lubgr If `end()` returns a pointer, could the container even be used with `std::accumulate`? – chew socks Jul 06 '18 at 20:56
2

@chewsocks Yes? Pointers are still iterators. – Barry Jul 06 '18 at 20:58
@Barry ... \*boom\*. I never thought of applying iterator functions to raw pointers. – chew socks Jul 06 '18 at 21:10
I love this answer `+1` but I would be tempted to put the `if(!argv.empty())` statement in the code just so it's clear for people who read code better than they read words. :-) – Galik Jul 06 '18 at 21:17
Does this use `acc = acc + element` or `acc += element` for accumulating strings (I feel it is the 1st one)? If it uses the 1st one then it must be doing a lot unnecessary memory allocations. – lakshayg Jul 06 '18 at 23:54
@Barry I do not understand what is the problem with --v.end() if the check is done to check that v is not empty. Can you provide example where this leads to problems? – NoSenseEtAl Jul 08 '18 at 09:20

Deduplicator · Answer 3 · 2018-07-07T11:48:23.910

My suggestion is to include the guideline support library in your project, if not a more general range-library, and then using a gsl::span (waiting for C++20 to get it as std::span is probably a bit long) or the like for accessing the subrange you want.

Also, it might be small, but it's complicated enough to warrant its own function:

template <class T>
constexpr gsl::span<T> drop_last(gsl::span<T> s, gsl::span<T>::index_type n = 1) noexcept
{ return s.subspan(0, std::min(s.size(), n) - n); }

for (auto s : drop_last(argv)) {
    // do things
}

Actually, taking a look at ranges and views for efficiency (less indirection, no copying) and decoupling (callee no longer needs to know the exact container used) is highly recommended.

score 2 · Answer 4 · answered Jul 06 '18 at 20:42

2

Another option is to use std::for_each with a lambda.

std::for_each(argv.begin(), std::prev(argv.end()), [&](const auto& s){ str += s; });

answered Jul 06 '18 at 20:42

NathanOliver

171,901
28
288
402

score 0 · Answer 5 · answered Jul 06 '18 at 21:26

It looks like you are trying to output the elements in a container with a space in between them. Another way to write that is:

const char* space = "";     // no space before the first item
for (const char* s : argv) {
    str += space;
    str += s;
    space = " ";
}

score 0 · Answer 6 · answered Jul 08 '18 at 09:24

To solve your specific example you could use Google Abseil.

Code example from str_join.h header:

   std::vector<std::string> v = {"foo", "bar", "baz"};
   std::string s = absl::StrJoin(v, "-");
   EXPECT_EQ("foo-bar-baz", s);

EXPECT_EQ is Google Test macro that means that s is equal to "foo-bar-baz"

score -1 · Answer 7 · answered Jul 08 '20 at 15:52

I'm going to play devil's advocate and say just cast argv.size() to int and use a regular for loop. This is going to work in probably 99% of cases, is easy to read, and doesn't require esoteric knowledge of C++ to understand.

for(int i=0; i<(int)argv.size() - 1; i++)

In the off-chance your container has over 2 billion elements, you will probably know this ahead of time, in which case, just use size_t and use a special case to check for argv.size() == 0 to prevent underflow.

R Sahu · Answer 8 · 2018-07-06T21:53:50.927

-2

I would recommend using std::prev(v.end()) instead of v.end()-1. That is more idiomatic and works for other containers too.

Here's a main function that demonstrates the idea.

int main()
{
   std::set<int> s = {1, 2, 3, 4};
   for (const auto& item: decltype(s)(s.begin(), std::prev(s.end())))
   {
      std::cout << item << std::endl;
   }

   std::vector<int> v = {10, 20, 30};
   for (const auto& item: decltype(v)(v.begin(), std::prev(v.end())))
   {
      std::cout << item << std::endl;
   }
}

Please note that the above code constructs temporary containers. Constructing a temporary container would be a performance problem when the container is large. For that case, use a simple for loop.

for (auto iter = argv.begin(); iter != std::prev(argv.end()); ++iter )
{
   str += *iter + ' ';
}

or use std::for_each.

std::for_each(argv.begin(), std::prev(argv.end()),
              [](std::string const& item) { str += item + ' '; });

edited Jul 06 '18 at 21:53

answered Jul 06 '18 at 20:39

R Sahu

204,454
14
159
270

4

Except you don't address the glaring performance problem of creating a copy of a potentially large container. – Omnifarious Jul 06 '18 at 20:44
@Omnifarious Is there a possibility of the compiler optimizing out the intermediate copy - maybe if it was declared const? – chew socks Jul 06 '18 at 20:45
4

@chewsocks - not likely. Too many side effects. Memory allocation and element copies. It would be nearly impossible for the compiler to prove the side-effects inconsequential. – Omnifarious Jul 06 '18 at 20:48
std::prev is C++11? – stephanmg May 20 '20 at 08:08
@stephanmg, yes, it is. https://en.cppreference.com/w/cpp/iterator/prev – R Sahu May 20 '20 at 15:40

Iterating a vector to second to last element with index vs iterator

8 Answers8