Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?

Question

It can be useful to iterate over multiple variables at once, overlapping (slice::windows), or not (slice::chunks).

This only works for slices; is it possible to do this for iterators, using tuples for convenience?

Something like the following could be written:

for (prev, next) in some_iter.windows(2) {
    ...
}

If not, could it be implemented as a trait on existing iterators?

You could easily do an `iter_pairs`, `iter_triples` once you decide what to do if there aren't enough items at the end, but not a generic "any size tuple" one with Rust at the moment. — Chris Emerson, Feb 09 '17 at 11:35
If there aren't enough it would do nothing, as with slice functions. — ideasman42, Feb 09 '17 at 12:01
This was pointed out to me on IRC `https://docs.rs/itertools/*/itertools/trait.Itertools.html#method.tuple_windows` would like to look into its code before posting answer though. — ideasman42, Feb 09 '17 at 12:02

Shepmaster · Answer 1 · 2020-09-20T19:34:57.647

48

It's possible to take chunks of an iterator using Itertools::tuples, up to a 4-tuple:

use itertools::Itertools; // 0.9.0

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();

    for (prev, next) in some_iter.tuples() {
        println!("{}--{}", prev, next);
    }
}

(playground)

1--2
3--4
5--6

If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer to access any leftovers:

use itertools::Itertools; // 0.9.0

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5].into_iter();

    let mut t = some_iter.tuples();
    for (prev, next) in t.by_ref() {
        println!("{}--{}", prev, next);
    }
    for leftover in t.into_buffer() {
        println!("{}", leftover);
    }
}

(playground)

1--2
3--4
5

It's also possible to take up to 4-tuple windows with Itertools::tuple_windows:

use itertools::Itertools; // 0.9.0

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();

    for (prev, next) in some_iter.tuple_windows() {
        println!("{}--{}", prev, next);
    }
}

(playground)

1--2
2--3
3--4
4--5
5--6

If you need to get partial chunks / windows, you can get

edited Sep 20 '20 at 19:34

answered Feb 09 '17 at 14:49

Shepmaster

388,571
95
1,107
1,366

Would it work with a tuple of 3 elements? Looking at the doc it seems it could be possible. – Matthieu M. Feb 09 '17 at 15:23
1

@MatthieuM. yep, but the number of implementations *is* limited to a 4-tuple (which I've added). – Shepmaster Feb 09 '17 at 15:31
Yes, well... in the absence of variadics it's otherwise painful to implement I guess (and bulky too). – Matthieu M. Feb 09 '17 at 15:49
Was this removed from the latest version of itertools? I can't see it [here](https://docs.rs/itertools/0.7.6/itertools/). – dshepherd Feb 04 '18 at 18:14
1

@dshepherd I continue to see both methods. I've updated the links to the documentation and provided playground links. – Shepmaster Feb 04 '18 at 18:19
1

Ah, I was looking at the list of free functions rather than the list of functions on the itertools trait. – dshepherd Feb 04 '18 at 18:46
This does not handle the case where the input is not divisible by the chunk length, right? So it is **not** an equivalent of `chunk`. – Bakuriu Sep 20 '20 at 16:01
@Bakuriu you can get the remaining pieces via [`Tuples::into_buffer`](https://docs.rs/itertools/0.9.0/itertools/structs/struct.Tuples.html#method.into_buffer) – Shepmaster Sep 20 '20 at 19:28

score 31 · Answer 2 · edited Feb 09 '17 at 15:15

TL;DR: The best way to have chunks and windows on an arbitrary iterator/collection is to first collect it into a Vec and iterate over that.

The exact syntax requested is impossible in Rust.

The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).

This is why chunks and windows return a sub-slice by the way; the number of elements in a &[T] is not part of the type and therefore can be decided at run-time.

Let's pretend you asked for: for slice in some_iter.windows(2) instead then.

Where would the storage backing this slice live?

It cannot live:

in the original collection because a LinkedList doesn't have a contiguous storage
in the iterator because of the definition of Iterator::Item, there is no lifetime available

So, unfortunately, slices can only be used when the backing storage is a slice.

If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item> as the Item of the chunking iterator.

struct Chunks<I: Iterator> {
    elements: Vec<<I as Iterator>::Item>,
    underlying: I,
}

impl<I: Iterator> Chunks<I> {
    fn new(iterator: I, size: usize) -> Chunks<I> {
        assert!(size > 0);

        let mut result = Chunks {
           underlying: iterator, elements: Vec::with_capacity(size)
        };
        result.refill(size);
        result
    }

    fn refill(&mut self, size: usize) {
        assert!(self.elements.is_empty());

        for _ in 0..size {
            match self.underlying.next() {
                Some(item) => self.elements.push(item),
                None => break,
            }
        }
    }
}

impl<I: Iterator> Iterator for Chunks<I> {
    type Item = Vec<<I as Iterator>::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.elements.is_empty() {
            return None;
        }

        let new_elements = Vec::with_capacity(self.elements.len());
        let result = std::mem::replace(&mut self.elements, new_elements);

        self.refill(result.len());

        Some(result)
    }
}

fn main() {
    let v = vec!(1, 2, 3, 4, 5);

    for slice in Chunks::new(v.iter(), 2) {
        println!("{:?}", slice);
    }
}

Will return:

[1, 2]
[3, 4]
[5]

The canny reader will realize that I surreptitiously switched from windows to chunks.

windows is more difficult, because it returns the same element multiple times which require that the element be Clone. Also, since it needs returning a full Vec each time, it will need internally to keep a Vec<Vec<Iterator::Item>>.

This is left as an exercise to the reader.

Finally, a note on performance: all those allocations are gonna hurt (especially in the windows case).

The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).

It's called collect::<Vec<_>>() in Rust.

And since the Vec has a chunks and windows methods (by virtue of implementing Deref<Target=[T]>), you can then use that instead:

for slice in v.iter().collect::<Vec<_>>().chunks(2) {
    println!("{:?}", slice);
}

for slice in v.iter().collect::<Vec<_>>().windows(2) {
    println!("{:?}", slice);
}

Sometimes the best solutions are the simplest.

Sorry to downvote, but *The exact syntax requested is impossible in Rust* is not true; please check [my answer](http://stackoverflow.com/a/42139758/155423). Most of the rest of your analysis makes sense though. — Shepmaster, Feb 09 '17 at 14:50
@Shepmaster: Your answer does not have the exact syntax requested either. The request is `for (prev, next) in some_iter.windows(2)`, with 2 a runtime parameter, which I interpret as meaning that I could pass 3 and have `for (n0, n1, n2) in some_iter.windows(3)` and that's impossible. You chose to focus on `(prev, next)` and ignore the runtime parameter, it may be OK with the OP, but as far as I am concerned this is not what they asked for (and I don't read minds). — Matthieu M., Feb 09 '17 at 15:10
A good point. specifying *both* the tuple size and an argument to `windows` would not make sense, especially if there was a mismatch. I might encourage you to draw explicit attention to that in your answer - perhaps adding an example? — Shepmaster, Feb 09 '17 at 15:17
@Shepmaster: I am not quite sure what kind of example you mean; I already cited that types cannot depend on values unless one uses Dependent Typing and I am at a loss at how to illustrate it to be honest. Maybe it's not that important since your answer is clearly better. — Matthieu M., Feb 09 '17 at 15:26
Your answer makes a lot of sense, but I cannot help but feel that the lack of `chunks` and `windows` on iterators is an issue. As you had already so, it's actually really easy to implement `chunks` with a dynamic allocation of only c elements where c is the size of the chunk; in contrast using `collect::>` and then chunking means we need a dynamic allocation of n elements where n is the total number of elements produced by the iterators. Your `chunks` example can be optimized to do a single allocation, so performance won't hurt. — kccqzy, Nov 18 '18 at 10:14
> `This is left as an exercise to the reader.` the `windows` functionality described here is exactly what I am looking for but I am not sure how to implement it, still new to Rust. Is there an example? — user5359531, Aug 13 '19 at 19:17

Ross MacArthur · Accepted Answer · 2022-12-01T17:36:56.817

On nightly

The chunks version is now available on nightly under the name array_chunks

#![feature(iter_array_chunks)]

for [a, b, c] in some_iter.array_chunks() {
    ...
}

And it handles remainders nicely:

#![feature(iter_array_chunks)]

for [a, b, c] in some_iter.by_ref().array_chunks() {
    ...
}

let rem = some_iter.into_remainder();

On stable

Since Rust 1.51 this is possible with const generics where the iterator yields constant size arrays [T; N] for any N.

I built two standalone crates which implement this:

iterchunks provides array_chunks()
iterwindows provides array_windows()

use iterchunks::IterChunks; // 0.2

for [a, b, c] in some_iter.array_chunks() {
    ...
}

use iterwindows::IterWindows; // 0.2

for [prev, next] in some_iter.array_windows() {
    ...
}

Using the example given in the Itertools answer:

use iterchunks::IterChunks; // 0.2

fn main() {
    let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();

    for [prev, next] in some_iter.array_chunks() {
        println!("{}--{}", prev, next);
    }
}

This outputs

1--2
3--4
5--6

Most times the array size can be inferred but you can also specific it explicitly. Additionally, any reasonable size N can be used, there is no limit like in the Itertools case.

use iterwindows::IterWindows; // 0.2

fn main() {
    let mut iter = vec![1, 2, 3, 4, 5, 6].into_iter().array_windows::<5>();
    println!("{:?}", iter.next());
    println!("{:?}", iter.next());
    println!("{:?}", iter.next());
}

This outputs

Some([1, 2, 3, 4, 5])
Some([2, 3, 4, 5, 6])
None

Note: array_windows() uses clone to yield elements multiple times so its best used for references and cheap to copy types.

note that this will drop off any remainder entries if not divisible by the given chunk size. — Bots Fab, Apr 06 '22 at 22:31

Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?

3 Answers3

On nightly

On stable

Linked

Related