29

Just a little introduction, with simple words. In C++, iterators are "things" on which you can write at least the dereference operator *it, the increment operator ++it, and for more advanced bidirectional iterators, the decrement --it, and last but not least, for random access iterators we need operator index it[] and possibly addition and subtraction.

Such "things" in C++ are objects of types with the according operator overloads, or plain and simple pointers.

std::vector<> is a container class that wraps a continuous array, so pointer as iterator makes sense. On the nets, and in some literature you can find vector.begin() used as a pointer.

The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize.

Knowing this, my question is why modern STL implementations, let's say MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector iterators?

Niall
  • 30,036
  • 10
  • 99
  • 142
dimm
  • 1,792
  • 11
  • 15
  • 14
    The question is: why not? Contrary to what you seem to think, using classes instead of pointers doesn’t imply added overhead, and using classes has other potential benefits. – Konrad Rudolph Sep 18 '15 at 14:11
  • Because iterator is a universal thing. Yes, for vector it is like a pointer, but for example a list_iterator is totally different, when you ++ a list iterator it won't just an sizeof(pointerType) offset from the pointer. Also, algorythms can use iterators because iterators promise that ++ them will give back the next element... a pointer just doesn't do this. – Melkon Sep 18 '15 at 14:12
  • 5
    One reason is safety : libraries have assertions on dereferencing an invalid iterator. – Quentin Sep 18 '15 at 14:12
  • 2
    It turns out that the compilers are smart enough to figure out that the vector iterator class just contains a pointer, and optimize from that. – Bo Persson Sep 18 '15 at 14:13
  • There's no overhead. in practice in the particular implementation, vector iterator is a pointer. There's a lot of overhead involved in bounds checking/making sure you don't deference a null pointer and so on - you can turn that off with some #defines. – Robinson Sep 18 '15 at 14:13
  • Incidently Vectors are guaranteed by the standard to be contiguous in memory, so that is a contractual requirement. But why do I need to keep that in my head when I'm just trying to iterate things? Indeed, why am I being given the richer interface of a pointer when all I need to do is iterate? – Nathan Cooper Sep 18 '15 at 15:17
  • @KonradRudolph: Unfortunately, in the way STL is usually implemented, using classes instead of pointers often adds some overhead. I speak about increased pressure on inlining capabilities of the compiler (which are not ideal). That one of the reasons why in [EASTL](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html) vector's iterator **is** a pointer (just search for first `vector::iterator` occurence). – stgatilov Sep 18 '15 at 16:29
  • 6
    @stgatilov I think that’s outdated knowledge. Yes, the standard library requires aggressive inlining capabilities. But modern compilers deliver this, and then some. Compiler have evolved **a lot** since 2007. – Konrad Rudolph Sep 18 '15 at 16:30
  • it's probably seen as 'kosher' to have the member typedefs ( value_type, difference_type etc ) in the iterator, as if inherited from std::iterator – sp2danny Sep 18 '15 at 16:45
  • 2
    Generic code, in general, is massively more practical with good inlining and comdat folding. A good modern compiler *must* be good at that task to leverage modern C++. Without it, C++ is crippled. However, there exist good modern compilers, and they are in fact relatively common at this point. And they continue to get better. – Yakk - Adam Nevraumont Sep 18 '15 at 18:47
  • 1
    @sp2danny I don't see how it's any less OK to get those typedefs from the relevant instantiation of `std::iterator_traits<>`, rather than members; said template exists to ensure a uniform interface for iterators, regardless of whether they (can) have member typedefs. And `std::iterator` is not a guide to whether something is any good, seeing as it's deprecated since C++17. – underscore_d Nov 17 '18 at 21:42

7 Answers7

27

You're completely correct that vector::iterator could be implemented by a simple pointer (see here) -- in fact the concept of an iterator is based on that of a pointer to an array element. For other containers, such as map, list, or deque, however, a pointer won't work at all. So why is this not done? Here are three reasons why a class implementation is preferrable over a raw pointer.

  1. Implementing an iterator as separate type allows additional functionality (beyond what is required by the standard), for example (added in edit following Quentins comment) the possibility to add assertions when dereferencing an iterator, for example, in debug mode.

  2. overload resolution If the iterator were a pointer T*, it could be passed as valid argument to a function taking T*, while this would not be possible with an iterator type. Thus making std::vector<>::iterator a pointer in fact changes the behaviour of existing code. Consider, for example,

    template<typename It>
    void foo(It begin, It end);
    void foo(const double*a, const double*b, size_t n=0);
    
    std::vector<double> vec;
    foo(vec.begin(), vec.end());    // which foo is called?
    
  3. argument-dependent lookup (ADL; pointed out by juanchopanza) If you make an unqualified call, ADL ensures that functions in namespace std will be searched only if the arguments are types defined in namespace std. So,

    std::vector<double> vec;
    sort(vec.begin(), vec.end());             // calls std::sort
    sort(vec.data(), vec.data()+vec.size());  // fails to compile
    

    std::sort is not found if vector<>::iterator were a mere pointer.

Walter
  • 44,150
  • 20
  • 113
  • 196
  • 6
    Overload resolution is probably the single biggest reason, +1 – Ben Voigt Sep 18 '15 at 14:37
  • 1
    @BenVoigt Another issue is ADL not working for plain pointers. Unless an implementation were allowed to define a custom pointer type inside the `std` namespace. – juanchopanza Sep 18 '15 at 14:44
  • @juanchopanza If I'm nost mistaken, ADL should only be relevant, if `vector::iterator` (rather than some template parameter) is explicitly used as argument by some `std` function. Do you have an example of such a function? – Walter Sep 18 '15 at 14:50
  • @Walter Any function that takes a pair of iterators. `std::vector v; sort(v.begin(), v.end());` doesn't work unless the iterators are in the `std` namespace. – juanchopanza Sep 18 '15 at 14:51
  • @Walter I think juanchopanza means an unqualified call to `sort()`, not a qualified call to `std::sort()`. – Barry Sep 18 '15 at 15:07
  • @Barry Yes, obviously. Qualified wouldn't need ADL. – juanchopanza Sep 18 '15 at 15:16
  • @Barry Okay, now I get it. (I never use unqualified calls to `std` functions). – Walter Sep 18 '15 at 15:18
  • 2
    Overload resolution could be even worse if the debug configuration used checked iterators (a class type) while release configuration does not. – dyp Sep 18 '15 at 17:22
7

The implementation of the iterator is implementation defined, so long as fulfills the requirements of the standard. It could be a pointer for vector, that would work. There are several reasons for not using a pointer;

  • consistency with other containers.
  • debug and error checking support
  • overload resolution, class based iterators allow for overloads to work differentiating them from plain pointers

If all the iterators were pointers, then ++it on a map would not increment it to the next element since the memory is not required to be not-contiguous. Past the contiguous memory of std:::vector most standard containers require "smarter" pointers - hence iterators.

The physical requirement's of the iterator dove-tail very well with the logical requirement that movement between elements it a well defined "idiom" of iterating over them, not just moving to the next memory location.

This was one of the original design requirements and goals of the STL; the orthogonal relationship between the containers, the algorithms and connecting the two through the iterators.

Now that they are classes, you can add a whole host of error checking and sanity checks to debug code (and then remove it for more optimised release code).


Given the positive aspects class based iterators bring, why should or should you not just use pointers for std::vector iterators - consistency. Early implementations of std::vector did indeed use plain pointers, you can use them for vector. Once you have to use classes for the other iterators, given the positives they bring, applying that to vector becomes a good idea.

Niall
  • 30,036
  • 10
  • 99
  • 142
  • 4
    The OP did explicitly ask about `vector::iterator`, not about `map::iterator`. – Walter Sep 18 '15 at 14:26
  • 2
    @Walter. Consistency. Early implementations of `std::vector` did indeed use plain pointers, you can use them for `vector`. Once you have to use classes for the other iterators, given the positives they bring, applying that to `vector` becomes a good idea. – Niall Sep 18 '15 at 14:28
  • 1
    @Walter: got it now? Things that use iterators work with practically every STL container, whereas using pointers works only on vectors/arrays, and isn't faster! – Marcus Müller Sep 18 '15 at 14:30
  • 4
    @MarcusMüller **You didn't get it**. What you say is not answering the question. The question is not *why not using pointers for iterators?*, but *why is `vector::iterator` implemented differently than a pointer, when a pointer would satisfy all the requirements?* (of a [RandomAccessIterator](http://en.cppreference.com/w/cpp/concept/RandomAccessIterator)) – Walter Sep 18 '15 at 14:38
  • @Walter: but *it wouldn't* satisfy all requirements! A bidirectional access iterator *has* to have the traits, which can only be done with a class wrapping your pointer. – Marcus Müller Sep 18 '15 at 14:44
  • 8
    @MarcusMüller No, there are iterator_traits for pointers. It works, and AFAIK a pointer is a valid implementation for a random access iterator (unless that changed in C++11). – juanchopanza Sep 18 '15 at 14:47
3

The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize.

It might be, but it isn't. If your implementation is not utter shite, a struct wrapping a pointer will achieve the same speed.

With that in mind, it's simple to see that simple benefits like better diagnostic messages (naming the iterator instead of T*), better overload resolution, ADL, and debug checking make the struct a clear winner over the pointer. The raw pointer has no advantages.

Puppy
  • 144,682
  • 38
  • 256
  • 465
2

The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize.

This is the misunderstanding at the heart of the question. A well formed class implementation will have no overhead, and identical performance all because the compiler can optimize away the abstraction and treat the iterator class as just a pointer in the case of std::vector.

That said,

MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector iterators

because they view that adding a layer of abstraction class iterator to define the concept of iteration over a std::vector is more beneficial than using an ordinary pointer for this purpose.

Abstractions have a different set of costs vs benefits, typically added design complexity (not necessarily related to performance or overhead) in exchange for flexibility, future proofing, hiding implementation details. The above compilers decided this added complexity is an appropriate cost to pay for the benefits of having an abstraction.

YoungJohn
  • 946
  • 12
  • 18
1

Because STL was designed with the idea that you can write something that iterates over an iterator, no matter whether that iterator's just equivalent to a pointer to an element of memory-contiguous arrays (like std::array or std::vector) or something like a linked list, a set of keys, something that gets generated on the fly on access etc.

Also, don't be fooled: In the vector case, dereferencing might (without debug options) just break down to a inlinable pointer dereference, so there wouldn't even be overhead after compilation!

Marcus Müller
  • 34,677
  • 4
  • 53
  • 94
  • 5
    I don't see how this answers the question. You can do all that (*iterate*) equatlly well with a pointer. For many containers, an iterator cannot be a plain pointer, but `std::vector` this is in fact possible. – Walter Sep 18 '15 at 14:24
  • @Walter, exactly: so, you **can't** iterate over a linked list by incrementing a pointer, you'll need some class to offer you that functionality, that's the purpose of the **iterator** concept. There's a lot of algorithms that work on **iterators**, so they work with any container which is indexable by an iterator, not just vectors; so you got one algorithm working on the data set, no matter whether it's a plain C array, a `std::vector`, a STL list, or something that you wrote yourself! And,as I said,in the `std::vector` case, the compiler actually reduces `*it` to nothing but a pointer deref. – Marcus Müller Sep 18 '15 at 14:28
  • 2
    @MarcusMüller That still doesn't answer the question. A pointer works fine as a random access iterator. And implementation can use an pointer fine, the only thing that I can think of that would change is ADL wouldn't work, unless the implementation can define the pointer type inside of the `std` namespace. But AFAIK, there isn't anything that says a pointer *can't* be used. – juanchopanza Sep 18 '15 at 14:40
  • @juanchopanza you mean aside from the consistency argument that Nialll extensively discusses below? – Marcus Müller Sep 18 '15 at 14:42
  • 3
    @MarcusMüller Well, you aren't making that argument here. – juanchopanza Sep 18 '15 at 14:46
  • 1
    @juanchopanza: I'm pretty sure that *no matter whether that iterator's just equivalent to a pointer [...] or something like a linked list, [...]* implies the argument of consistency! – Marcus Müller Sep 18 '15 at 14:47
  • 2
    I'm pretty sure one could make a good argument that it implies that a plain pointer is fine, because it satisfies all the requirements of a forward iterator. Your first paragraph offers no explanation as to why `std::vector::iterator` is not a pointer on most recent, popular implementations. As has already been mentioned, it does not answer the question. – juanchopanza Sep 18 '15 at 21:18
  • Funny enough `std::array::iterator` *is* a pointer in the implementations of STL I have handy. So the argument is not fully explanatory. (see my answer, I think it is simply a historical glitch above all). – alfC Nov 19 '20 at 11:36
  • @alfC since ISO 14882, 2nd Edition (C++03), std::vector's underlying storage *is* memory-contiguous by standard definition, so it really doesn't explain why in C++ of the last ~20 years, the iterator wasn't just a pointer – Marcus Müller Nov 19 '20 at 11:48
  • 1
    @MarcusMüller, I think once people depend on it to be a class it can't be made a built-in again. For example some people might have depended on `std::vector::iterator::difference_type` to be a valid token by the time this defect was fixed. – alfC Nov 19 '20 at 11:51
0

I think the reason is plain and simple: originally std::vector was not required to be implemented over contiguous blocks of memory. So the interface could not just present a pointer.

source: https://stackoverflow.com/a/849190/225186

This was fixed later and std::vector was required to be in contiguous memory, but it was probably too late to make std::vector<T>::iterator a pointer. Maybe some code already depended on iterator to be a class/struct.

Interestingly, I found implementations of std::vector<T>::iterator where this is valid and generated a "null" iterators (just like a null pointer) it = {};.

    std::vector<double>::iterator it = {};
    assert( &*it == nullptr );

Also, std::array<T>::iterator and std::initializer_list<T>::iterator are pointers T* in the implementations I saw.

A plain pointer as std::vector<T>::iterator would be perfectly fine in my opinion, in theory. In practice, being a built-in has observable effects for metaprogramming, (e.g. std::vector<T>::iterator::difference_type wouldn't be valid, yes, one should have used iterator_traits).

Not-being a raw pointer has the (very) marginal advantage of disallowing nullability (it == nullptr) or default conductibility if you are into that. (an argument that doesn't matter for a generic programming point of view.)

At the same time the dedicated class iterators had a steep cost in other metaprogramming aspects, because if ::iterator were a pointer one wouldn't need to have ad hoc methods to detect contiguous memory (see contiguous_iterator_tag in https://en.cppreference.com/w/cpp/iterator/iterator_tags) and generic code over vectors could be directly forwarded to legacy C-functions. For this reason alone I would argue that iterator-not-being-a-pointer was a costly mistake. It just made it hard to interact with C-code (as you need another layer of functions and type detection to safely forward stuff to C).

Having said this, I think we could still make things better by allowing automatic conversions from iterators to pointers and perhaps explicit (?) conversions from pointer to vector::iterators.

alfC
  • 14,261
  • 4
  • 67
  • 118
-1

I got around this pesky obstacle by dereferencing and immediately referencing the iterator again. It looks ridiculous, but it satisfies MSVC...

class Thing {
  . . .
};

void handleThing(Thing* thing) {
  // do stuff
}

vector<Thing> vec;
// put some elements into vec now

for (auto it = vec.begin(); it != vec.end(); ++it)
  // handleThing(it);   // this doesn't work, would have been elegant ..
  handleThing(&*it);    // this DOES work
user1564286
  • 195
  • 1
  • 8
  • 1
    This doesn't answer the question. And it's not a "pesky obstacle"; it's exactly what you should need to do to ensure you actually get a pointer, since an iterator need not be one and very good reasons have already been provided in other (actual) answers for why it shouldn't. Obviously a better pattern would be to take `Thing&` and pass `*it`. Also note that your code won't work in generic situations where the element type might overload `operator&` (not advisable). – underscore_d Nov 17 '18 at 21:47
  • This works and is a reasonable solution. A minor clarification of terms, though: once you dereference (*) the iterator, you are now referring to the object itself. So the address-of (&) is a pointer to the object, but not an iterator. – Jonathan Lidbeck Apr 16 '19 at 16:31