18

A typical example of a C++ iterator is a pointer, and can be used to point at an element in a C array like so:

int array[] = {1, 2, 3, 4};
int* begin = std::begin(array); //Starting iterator
int* end = std::end(array) //Ending iterator

for(int* i = begin; i < end; i++)
{
    std::cout << *i << ',';
}

//Prints 1, 2, 3, 4

This is simple enough. The definition of an iterator from cplusplus.com is

An iterator is any object that, pointing to some element in a range of elements (such as an array or a container), has the ability to iterate through the elements of that range using a set of operators...

This makes sense; in the above code there were two iterators (the begin and the end iterators), and it used a for loop and incremented.

In Rust, an iterator is used like this:

let vect = vec![1, 2, 3, 4];

let vect_iter = vect.iter();

What? To iterate it you do:

vect_iter.next();
vect_iter.next();

I couldn't find any exact definition of a pointer in the Rust docs, but looking at the Iterator trait, it is seems that an iterator is a wrapper for a container that enables for easier processing, by standardizing the logic in a way (if that makes sense at all).

The main questions I have are:

  1. What are the main differences?
  2. Why does Rust have iterators in this fashion and why are they expressed so differently?
  3. Are there Rust-type iterators in C++?
  4. Are there C++-type iterators in Rust?
  5. Are they called something specific? (Internal/External?)
Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
The_Babu
  • 332
  • 3
  • 12
  • 6
    Two different languages, different syntax, etc. – PaulMcKenzie Feb 27 '18 at 01:36
  • 1
    Python's approach is quite similar to Rust's. You wrap up a bit more logic in the iterator, so it knows when it ends without needing to compare it to a sentinel value (the end iterator in C++). C++'s approach can be a bit tedious, since utility functions end up needing separate arguments for the begin and end iterators, rather than simply receiving an iterator that knows its own end point. – ShadowRanger Feb 27 '18 at 01:41
  • The Rust equivalent of the C++ code would be `let v = vec![1, 2, 3, 4]; for i in &v { println!("{},", i); }` Note the lack of calls to `next`. You may wish to further your learning of Rust; these types of questions wash away with increased baseline knowledge. – Shepmaster Feb 27 '18 at 01:43
  • 1
    So are there any of these types of Iterators in C++, because I agree with @ShadowRanger, it is quite tedious, and I think Rust's approach is better... but I think that is not the case sense a lot of functions need to separately have the begin and ending iterators... Also I've researched a bit and seen that these may be called Internal and External Iterators... but is that entirely true? – The_Babu Feb 27 '18 at 01:52
  • 2
    C++11 added range-for: http://en.cppreference.com/w/cpp/language/range-for which at least moves in the right direction. – loganfsmyth Feb 27 '18 at 01:57
  • @Hacksaurus_Babu: the Rust vector example you showed is not unlike how most C++ containers operator. They typically use classes for their iterators instead of raw pointers, and those classes implement `operator++` similar to Rust's `next()`. Your `std::begin()` example is different because it uses a fixed array, so it uses raw pointers for its iterators. Try using `std::array` instead of `int[]`, and you might see something different, depending on STL implementation – Remy Lebeau Feb 27 '18 at 02:44
  • Instead of `for(int* i = begin; i < end; i++)` you should use the iterators directly: `for(auto it = std::begin(array); it != std::end(array); ++it)` – Tim Diekmann Feb 27 '18 at 03:53
  • 1. and 2: In Rust, an amateur newbie like me can create a iterator for my own custom data type after an hour or two of studying, and once it compiles, it usually "just works". I then can use dozens of different iterator adaptors (filter,map,fold) that are built into the language, thanks to its heritage from Functional-ish and immutable-ish languages like lisp and haskell. In C++ iterators are massive header templates that, when used strictly, keep you from writing terrible for(){} blocks and crashing all the time and accidentally modifying your index while you are in the middle of a loop. – don bright Jan 20 '19 at 06:04

3 Answers3

22

An iterator is a concept found in programming languages to refer to a construct which allows iterating over collections or sequences of elements. The concept is purposely vague, it's a concept! It does not prescribe any specific implementation.

To more easily distinguish C++ from Rust, I am going to use different names:

  • C++ iterators will be named cursors,
  • Rust iterators will be named streams.

Yes, those are completely arbitrary. Note that if you look at languages like Java or C#, you will find that they also use streams.


C++

First of all, don't use cplusplus.com. cppreference.com is much better.

An iterator is any object that, pointing to some element in a range of elements (such as an array or a container), has the ability to iterate through the elements of that range using a set of operators...

Simple, and wrong.

A cursor may either:

  • point to an element,
  • or be singular and point to no element at all.

In general, the singular value is used to represent:

  • the end of the sequence to iterate over: vec.end(),
  • the absence of an element: std::find(...).

You can increment, and sometimes decrement, a cursor. If you do so, you however generally need a pair of cursors to know when to stop.

Why did C++ use such a representation? Because that's how C did it, and it works pretty well... although it's error prone.


Rust

Rust endeavors to be safe and favors APIs which are easy to use. This rules out a pair of cursors:

  • a pair of cursors is not safe: you can easily iterate out of bounds, and you can obtain aliasing references,
  • a pair of cursors is error-prone: it's easy to accidentally pair off cursors from two different sequences.

In order to control bounds, aliasing and avoid pair mismatch, you have to use a single object; thus the stream-like API.

The Iterator API in Rust is reminiscent of that of Java and C#, although Rust improves upon it by using Option<T> so that instead of the clumsy hasNext()/next() call pair, it offers a single method next() which both advances the stream and may signal its end.


Conclusion

Both Rust and C++ features a way to iterate over a collection of elements:

  • C++ offers a C-like way, flexible but error-prone,
  • Rust offers a modern way, safe but less flexible.

Both languages also offer external and internal iteration:

  • External: the user controls the iteration (calls ++ or next()),
  • Internal: the iterator controls the user code (see std::foreach and Iterator::foreach).
Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
2

Iterators in Rust and C++ are conceptually quite different.

C++

In C++, an iterator is like a pointer. Iterators refer to an object, they can be incremented to refer to the next object, and they can be compared for equality with other iterators. Iterators can also refer to no object at all—they can refer to the “one past the end” element of a sequence, or they can be “singular” (which is like a null pointer). Some iterators support additional operations like moving both forwards and backwards, random access, and being copied.

A pointer in C++ is a valid iterator, but there are also other types which are iterators.

Iterators do not represent a sequence of elements, at least, that is not the convention. In C++, if you want a sequence of elements, you need a pair of iterators*: one for the beginning and one for the end. You aren't forced to iterate over the elements in sequence, you can do all sorts of other things. For example, if you want to reverse an array in C++, you can do it with iterators:

#include <algorithm>
#include <iterator>
#include <cstdio>
#include <utility>

template <typename T, std::size_t N>
void reverse_array(T (&arr)[N]) {
    using std::swap;
    auto left = std::begin(arr), right = std::end(arr);
    while (left < right) {
        --right;
        swap(*left, *right);
        ++left;
    }
}

int main() {
    int x[] = {1, 2, 3, 4, 5};
    reverse_array(x);
    for (const auto it : x) {
        std::printf("%d\n", it);
    }
    return 0;
}

But you can quickly generalize it to work on any container with bidirectional iterators:

#include <algorithm>
#include <iterator>
#include <list>
#include <cstdio>
#include <utility>

template <typename Iterator>
void reverse_any(Iterator left, Iterator right) {
    using std::swap;
    while (left != right) {
        --right;
        if (left == right)
            break;
        swap(*left, *right);
        ++left;
    }
}

int main() {
    std::list<int> list{1, 2, 3, 4, 5};
    reverse_any(std::begin(list), std::end(list));
    for (const auto it : list) {
        std::printf("%d\n", it);
    }
    return 0;
}

Rust

In Rust, an iterator is like a slice. Iterators refer to a sequence of objects, and elements can be accessed from the iterator using the next() method. In a sense, this means an iterator in Rust has both the begin and end iterator inside it. Reimplementing the C++ code above in Rust, you'll get something like this:

fn reverse_any<'a, T: 'a, Iter>(mut iter: Iter)
where
    Iter: DoubleEndedIterator<Item = &'a mut T>,
{
    while let Some(left) = iter.next() {
        if let Some(right) = iter.next_back() {
            std::mem::swap(left, right);
        }
    }
}

fn main() {
    let mut v = [1, 2, 3, 4, 5];
    reverse_any(v.iter_mut());
    println!("{:?}", v);
}

This has the added benefit of safety. Iterator invalidation is one of the most common sources of errors in C++ programs, but Rust eliminates the problem completely.

The cost is that if you want to mutate the elements, you are limited to a single (possibly double-ended) iterator in Rust, while in C++, you can have as many iterators as you want working with the same container. While single-ended and double-ended ranges are the most common case for iterators, there are some algorithms that use the additional flexibility that C++ provides.

One simple example I can think of is C++'s std::remove_if. A straightforward implementation of remove_if would use three iterators: two iterators for keeping track of the range of elements being scanned, and a third iterator for tracking the elements being written. You could translate std::remove_if to Rust, but it would not be able to operate on normal Rust iterators and still modify the container in-place.

Another simple example is the Dutch National Flag problem, which commonly uses three iterators. The solution to this problem is often used for partitioning elements for quicksort, so it's an important problem.

Summary

A Rust iterator is almost equivalent to a start + end C++ iterator pair. C++ allows you to use multiple iterators and move iterators forwards and backwards. Rust guarantees that you don't accidentally use an invalid iterator, but you can only use one at a time and it can only move in one direction.

I don't know of any terminology for distinguishing these types of iterators. Note that Rust-style iterators are much more common, iterators in C#, Python, Java, etc. work the same way but they might have slightly different names (they are called "enumerators" in C#).

Footnotes

*: Technically this is not true. You only need to have one iterator in C++, however, it is conventional to have a pair and library functions typically operate on pairs of iterators (so you "need" two iterators if you want to use those functions). The fact that you have a (start,end) pair does not mean that sequences are bounded, the end iterator can be infinitely far away. Think of it like having a range (0,∞) in mathematics... ∞ isn't really a number, it's just a placeholder that lets you know that the range is unbounded on the right.

: Remember that just because the “end” iterator exists in C++ does not mean that the sequence actually has an end. Some end iterators in C++ are like infinity. They don’t point to valid elements, and no matter how many times you iterate forward, you won’t reach infinity. In Rust, the equivalent construction is an iterator that never returns None.

Dietrich Epp
  • 205,541
  • 37
  • 345
  • 415
  • "A Rust iterator is almost equivalent to a start + end C++ iterator pair." this would assume, that every iterator impl [ExactSizeIterator](https://doc.rust-lang.org/std/iter/trait.ExactSizeIterator.html), which is not the case at all – Tim Diekmann Feb 27 '18 at 04:21
  • @Tim: ExactSizeIterator in Rust more or less corresponds to ForwardIterator in C++. Just like not all iterators in Rust are ExactSizeIterator, not all iterators in C++ are ForwardIterator.\ – Dietrich Epp Feb 27 '18 at 04:27
  • I just wanted to clearify, that not all iterators in rust have something like `end`, since they are lazy. – Tim Diekmann Feb 27 '18 at 04:29
  • 3
    @Tim: The exact same thing applies to C++. Just because `end` is assigned a value does not mean you can access elements through `end`. For a simple concrete example, see `std::forward_list`, there is literally nothing you can do with the end iterator except compare it against other iterators. Another example is `std::istream_iterator`, which literally just reads values from a stream until end of stream. The `end` may have a value, but it doesn't give you any additional functionality. – Dietrich Epp Feb 27 '18 at 04:34
  • in rust you don't necessarily **have** an `end`. Iterators in rust can be unbound. – Tim Diekmann Feb 27 '18 at 04:36
  • 3
    @Tim: Again, this is also true in C++. The only requirement is that you can compare your `start` and `end` iterator. There is no requirement that they will ever compare as equal. – Dietrich Epp Feb 27 '18 at 04:37
  • 2
    @Tim: The documentation for the basic Iterator concept can be found here: http://en.cppreference.com/w/cpp/concept/Iterator. This document may clarify any questions you have about how C++ iterators work. – Dietrich Epp Feb 27 '18 at 04:40
  • We may need to distinct `iterator` and `iterator concept`. I'm aware of the page, but I'd refer to the [former](http://en.cppreference.com/w/cpp/iterator), which requires [(c)end](http://en.cppreference.com/w/cpp/iterator/end). We may have talked at cross purposes here :) – Tim Diekmann Feb 27 '18 at 04:43
  • 4
    @Tim: Just like in Rust, not all iterators are made from containers, in C++, not all iterators come from `std::begin()` and `std::end()`. Additionally, there is no guarantee that `std::end()` is defined, and if it is defined there is no guarantee that `std::end()` is reachable from `std::begin()`. There is not really a "distinct" `iterator` and `iterator concept`, the word "iterator" refers to an object which conforms to the iterator concept. – Dietrich Epp Feb 27 '18 at 04:49
  • 2
    *The cost is that you are limited to a single (possibly double-ended) iterator in Rust* — [This is **not true**](https://play.rust-lang.org/?gist=06210aeae53391dd5d74f8aa77a59ce6&version=stable). Perhaps you meant something more subtle? – Shepmaster Feb 27 '18 at 13:44
  • 1
    @Shepmaster: Sorry, meant to talk about mutable iterators. – Dietrich Epp Feb 27 '18 at 14:56
-2

I see three things going on here. Let's break it down.

The Idea of an Iterator

When you call C++'s std::begin and Rust's .iter() in your examples, you are receiving two "types of objects" that are conceptually identical: An iterator.

If we forget about the implementation details for a moment, we can see the purpose and usability of an iterator happen to be similar in both languages. We find that both iterators:

  • Are "objects" which can be created from a collection (an "Iterable type")
  • Can be advanced using C++'s std::advance and Rust's .next()
  • Have an "end," determined by C++'s std::end and the output of Rust's .next().

This is a gross oversimplification of course, they are similar and different in many other ways, but this is probably the general overview you're looking for.

The Implementation of an Iterator

Despite sharing common themes, C++ and Rust are very different languages and will naturally implement a single idea differently. Iterators are no exception.

The "why" is too broad to really answer here on Stack Overflow. It's like asking why oranges are orange and bananas are not :)

But you do seem somewhat confused about how to work with Rust's implementation of iterators, given your comment:

I couldn't find any exact definition of a pointer in the Rust docs

Don't think like a C++ programmer right now. Check out The Book if you haven't already and explore the concepts of borrowing and ownership; It is the far more typical way you will work with data, and it is required to understand how Rust's iterators work.

The Syntactical Sugar for Iterators

Both C++ and Rust have "magic" in their for loops that let you work with iterator "types" easily.

Contrary to your question, this is not a concept unique to Rust. In C++ an object can be used with the modern for (item : collection) syntax if it implements special methods, similar to the Iterator trait you pointed out.

Summary

What are the main differences?

Not much conceptually.

Why does Rust have iterators in this fashion and why are they expressed so differently?

It be like it is because it do. They are more similar than you believe.

Are there Rust-type iterators in C++? Are there C++-type iterators in Rust?

They are conceptually identical.

Are they called something specific? (Internal/External?)

There might be some fancy academic terminology for the implementation differences but I'm not aware of it. An iterator is an iterator.

Litty
  • 1,856
  • 1
  • 16
  • 35
  • 5
    Conceptually, C++ iterators are vastly different from iterators in Rust. Iterators in C++ are more like generalized pointers and you can write functions like `std::sort` or `std::find_if`. In Rust, you can't use iterators to represent ranges at all, so many of these techniques from C++ must be rewritten. – Dietrich Epp Feb 27 '18 at 02:49
  • Before the question's edits it seemed the basic mental concept of an iterator was missing; It felt appropriate to shed things down to a basic level. I agree this answer is too oversimplified now. – Litty Feb 27 '18 at 03:21
  • @Litty, Your answer is not too oversimplified, the difference is realy about the language grammar, not about the iterator concept. The implementations of this same concept in these two languages reflect the intent of these two languages: in Rust errors are avoided, in C++ optimal efficiency is accessible. – Oliv Feb 27 '18 at 08:56
  • 2
    @Oliv: Could you give an example where optimal efficiency is possible in C++ but not Rust? – Matthieu M. Feb 27 '18 at 13:58
  • 1
    @DietrichEpp can you expand on what you mean by *in Rust, you can't use iterators to represent ranges at all*? For example, this [is what I would consider iterating over a range of a collection](https://play.rust-lang.org/?gist=7fc8369571a5867a7ef92f14351e712a&version=stable). You can also iterate directly on a range (`for i in 0..10 {}`). – Shepmaster Feb 27 '18 at 14:05
  • @Mathieu M. What about a structure that holds 2 iterator pointing to 2 container having the same size? What would be the size of such a struct in Rust and C++. Rust check that you do not derefence an invalid pointer each time, C++ does not, etc... There is nothing free, if rust ensures no mistake, it descreases freedom so it decreases oportunities to express your idea or to do efficient code. Your comment is almost systemic when one read word like efficiency. Reaction is not a thought, this is someone else thought. So blind acceptance or denial by reaction are both signs of manipulation. – Oliv Feb 27 '18 at 14:45
  • @Shepmaster: You can do that, but an iterator in Rust lacks many of the operations I would consider standard for working with ranges. You can’t compare their positions, test if one range is contained in another, or extend the range. So the iterators may iterate over a range, but they can’t really be used to represent the range itself. C++ supports comparisons between iterators and lets you move some iterators in both directions, which gives you everything you need to represent ranges as iterator pairs. – Dietrich Epp Feb 27 '18 at 15:18