Why is reverse_iterator::base offset?

Question

      +-- v.begin()           +-- v.end()
      |                       |
      v                       v
    +---+---+---+---+---+---+ - +
    | o | o | o | o | o | o | x |
    +---+---+---+---+---+---+ - +

+ - +---+---+---+---+---+---+
| x | o | o | o | o | o | o |
+ - +---+---+---+---+---+---+
  ^                       ^
  |                       |
  +-- v.rend()            +-- v.rbegin()

(ASCII copied, and edited, from this answer, which is actually prompted me to ask the present question.)

I do see the advantage of having that &*rit == &*(rit.base() - 1), because this way I can take the rit.base() for any reverse iterator rit and I'll always get a valid iterator.

But at the same time

I cannot dereference v.rbegin().base(); I have to remember subtracting 1 first, *(v.rbegin().base() - 1),
and I cannot dereference v.rend().base() - 1 exactly because I can't dereference v.rend() .

What if the design was that &*rit == &*rit.base()?

We couldn't call v.rend().base(), true, but that simply corresponds to not being able to dereference v.rend().base() - 1 in the current design;
We wouldn't be able to get v.end() directly from a reverse iterator, not even the closest one, b.rbegin(), but that simply to the - 1 we have to add to rit.base() in the current design to get a forward iterator to the same element as the reverse.

What I mean is that it seems to me that whether the design decision was that &*rit == &*(rit.base() - 1) (as it was) or that &*rit == &*rit.base(), we'd have had the same amount of convenience

rit.base() is always ok in the actual design,
- 1 is not generally needed in the alternative design

and inconvenience

can't dereference all valid rit.base()s in the actual design,
need to +1 v.rbegin() to get v.end() in the alternative design,

just in opposite situations.

So my question is: is there a definitive advantage in making the choice that has indeed been made? Or was it just a flipped coin?

I see that there's no valid iterator before the first element, but that's my point. When reverse iterators were introduced, what would have been wrong with making std::prev(v.begin()) a valid, non-dereferenceable iterator just like v.end()?

In hindsight, I do see one unquestionable advantage.

I didn't see the advantage of v.rbegin().base() == v.end()/v.rend().base() == v.begin() because why would I want to create v.end()/v.begin() from their reverse counterparts?

But if I have two reverse iterators rit1 and rit2 that define the range (rit1, rit2], then taking it1 = rit1.base() and it2 = rit2.base() allows to refer easily to the same range of elements in the opposite direction, [it1, it2).

Long story short: _I should have read §9.4.1 from The C++ Standarl Library - 2nd ed. first.

BTW, I found the offset is quite convenient when you break a sequence by searching separator values from the end: `std::string last_line = { find(str.rbegin(), str.rend(), '\n').base(), str.end() };` works reliably without additional checks. — j6t, Mar 05 '22 at 22:30

score 8 · Answer 1 · edited Mar 06 '22 at 03:13

8

It is not a design decision, it is a necessity.

A reverse iterator is not a magical thing that is somehow able to iterate a range in reverse. It is a facade that builds on top of things that are already present.

When you have a collection with 6 entries (like in your picture), then all you have is exactly 7 usable iterator values, nothing more (6 are dereferencable, one is the end iterator value). That is all that the reverse iterator can build on.

Since there is no one-before-the-beginning iterator (like the second picture makes one think), there is no other option for the reverse iterator than to map rbegin() to end() and rend() to begin(). That is, you have (rbegin() + i).base() == end() - i

That, in turn, necessitates that the dereferencable iterators (the first 6 starting at rbegin()) must actually dereference iterator .base()-1. There is no other way to implement a reverse iterator.

edited Mar 06 '22 at 03:13

Remy Lebeau

555,201
31
458
770

answered Mar 05 '22 at 21:55

j6t

9,150
1
15
35

_Since there is no one-before-the-beginning iterator (like the second picture makes one think)_ I don't understand this. Even the one-after-the-end iterator is a convenience that doesn't correspond to anything real in memory. Nothing is allocated for it. It's just mean to be a sentinel which you can compare any valid-and-dereferenceable iterator against. Why couldn't a reverse iterator define its own sentinel in a similar way? – Enlico Mar 05 '22 at 23:38
4

@Enlico: "*Even the one-after-the-end iterator is a convenience that doesn't correspond to anything real in memory.*" But it does. It may not point to a real object, but it does *exist* in the sense that it is a valid iterator. The C++ standard requires that if you have an array of X elements, that you can get a pointer to the first element, increment it X times, and still have a pointer you can validly decrement. By contrast, the standard does not allow you to decrement the pointer to the first element of an array; such an operation is UB. The iterator model mimics pointer behavior. – Nicol Bolas Mar 06 '22 at 06:45
@NicolBolas, I see your point, but the fact that _the standard does not allow you to decrement the pointer to the first element of an array_ is exactly what prevents this "alternate design" to be possible. But couldn't the standard have accomodated this design by accomodating at the same time that the `.begin()` can be decremented by one? Maybe this wouldn't have been possible because the latter "accomodating ..." would have broken existing code? – Enlico Mar 06 '22 at 06:56
1

@Enlico The standard library's iterators are based on the design of the SGI template library. The SGI template library was not part of the standard and couldn't change anything about it. So it had to make do with what was possible. And that's just for arrays. For linked lists, suddenly the linked list would have to accommodate the "one before begin()" iterator somehow, which means explicit code. And so on - every container would have to support decrementing begin(). That's just not how you design a library. – Sebastian Redl Mar 06 '22 at 13:02

score 2 · Answer 2 · edited Mar 06 '22 at 06:30

In the current design, all of this works as expected:

std::make_reverse_iterator(it).base() == it;
auto rbegin = std::make_reverse_iterator(end);
auto rend   = std::make_reverse_iterator(begin);

In your alternative design, it wouldn't be possible for all of them to hold. If base &*rit == &*rit.base() then

either you must construct with an offset:
```
auto last      = std::prev(end)
auto rbegin    = std::make_reverse_iterator(last);
auto pastbegin = std::prev(begin);         // not allowed!
auto rend      = std::make_reverse_iterator(pastbegin);
```
This would require introducing the concept of iterator to "before first element" which doesn't exist. std::prev(begin) isn't allowed in the current design, and allowing it would make it impossible to have reverse iterators for arrays whose iterators are pointers, because the language prohibits creation of pointers before the first element.
or you must offset after construction
```
auto rpastbegin = std::make_reverse_iterator(end);
auto rbegin     = std::next(rpastbegin);
auto rlast      = std::make_reverse_iterator(begin);
auto rend       = std::next(rlast);
```
This approach requires introduction of past begin only for reverse iterators. Since they aren't pointers, that might be possible in theory, but this would necessitate extra overhead to represent more iterator states than the base iterator has.

Neither alternative is particularly convenient to use.

Why is reverse_iterator::base offset?

2 Answers2