2

I'm trying to create a peek_while method for my iterator which should basically do the same as take_while but only consume the character after the predicate matched. I've taken some inspiration from https://stackoverflow.com/a/30540952/10315665 and from the actual take_while source code https://github.com/rust-lang/rust/blob/2c7bc5e33c25e29058cbafefe680da8d5e9220e9/library/core/src/iter/adapters/take_while.rs#L42-L54 and arrived at this result:

pub struct PeekWhile<I: Iterator, P> {
    iter: Peekable<I>,
    predicate: P,
}

impl<I, P> fmt::Debug for PeekWhile<I, P>
where
    I: Iterator + Debug,
    <I as Iterator>::Item: Debug,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("PeekWhile")
            .field("iter", &self.iter)
            .finish()
    }
}

impl<I, P> Iterator for PeekWhile<I, P>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        let n = self.iter.peek()?;
      
        while let Some(n) = self.iter.peek() {
            if (self.predicate)(&n) {
                return self.iter.next();
            } else {
                break;
            }
        }
        None
    }
}

pub trait PeekWhileExt: Iterator {
    fn peek_while<P>(self, predicate: P) -> PeekWhile<Self, P>
    where
        Self: Iterator,
        Self: Sized,
        P: FnMut(&Self::Item) -> bool,
    {
        PeekWhile {
            iter: self.peekable(),
            predicate,
        }
    }
}

impl<I: Iterator> PeekWhileExt for I {}

This is causing an infinite loop, and while I'm not sure why, I see that the take_while::next method does not have a loop, so I changed that to:

let n = self.iter.peek()?;

if (self.predicate)(n) {
    Some(n)
} else {
    None
}

which now gives me:

mismatched types
expected associated type `<I as Iterator>::Item`
         found reference `&<I as Iterator>::Item`

So how would I go about creating such an iterator? Is the code so far correct, and how to I complete it? I know that itertools has https://docs.rs/itertools/0.10.1/itertools/trait.Itertools.html#method.peeking_take_while which sounds promising, but this is a learning project, both in programming concepts and rust itself (I'm creating a JSON parser btw), so I would very much be interested in completing that portion of the code without any libraries.

Example use case (not tested):

let chars = "keyword:".chars();

assert_eq!(chars.peek_while(|c| c.is_alphabetic()).collect::<String>(), "keyword");
assert_eq!(chars.next().unwrap(), ':');
//                                 ^
// Very important that the next char doesn't get lost

Appreciate any help!

Elias
  • 3,592
  • 2
  • 19
  • 42
  • [`Peekable::peek`](https://doc.rust-lang.org/std/iter/struct.Peekable.html#method.peek) returns an `&Item`, which makes perfect sense because the item needs to remain available until the `Peekable` is advanced (for any repeat calls to `peek` and indeed the eventual call to `next`). Similarly, because you want the item to remain available after peeking and unless `Item: Copy or Clone`, your `PeekWhile` can only really produce `&Item`s rather than `Item`s. – eggyal Sep 15 '21 at 20:09
  • But there's a bigger problem here: `Peekable:peek` advances the underlying iterator (and indeed that's the only way that one can peek its next item). So you need to rethink the design. Perhaps `PeekWhileExt` should only be implemented for `Peekable`s? – eggyal Sep 15 '21 at 20:11
  • @eggyal I'm open to re-thinking the design. Originally I didn't use iterators but simply passed the string and the current index around. Now I started refactoring to use iterators which (so far) is much cleaner-looking code. And yes, `PeekWhileExt` should only be implemented for `Peekable`s. The issue I faced while declaring this though, is that `Peekable` is an interface and not a trait. Thus this code :) – Elias Sep 15 '21 at 20:14
  • 1
    "This is causing an infinite loop" – While your code does not do what you want it to do, I don't think it can loop forever. [Playground example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=835f53159699ccf6ba93a6a189dc018a) – Elias Holzmann Sep 15 '21 at 20:17
  • @EliasHolzmann ahh... indeed, you are right. So no infinity loop, but definitely not doing what I want. From what you sent it looks like it acts like the normal `take_while`? – Elias Sep 15 '21 at 20:23
  • To clarify: "Originally I didn't use iterators but simply passed the string and the current index around.", I was referencing the JSON parser this code will be used in. – Elias Sep 15 '21 at 20:28
  • "is that Peekable is an interface and not a trait" not `interface` but `struct`. – Elias Sep 15 '21 at 20:44
  • I don't understand what `peek_while` should do. Can you elaborate on "only consume the character after the predicate matched"? Does that mean that you want to call `predicate(next_item)` over and over until `predicate` returns `false`, then yield `next_item`? That *would* cause an infinite loop unless `predicate` were designed to be stateful, but that isn't what *this* code does, and it doesn't seem to match your description of the problem. Or do you mean `PeekWhile` should *yield* the same thing over and over again until the `predicate` becomes false? – trent Sep 15 '21 at 21:48
  • I notice that the `while let` in `fn next` could be an `if let`, because it never loops - is that what you meant to do? – trent Sep 15 '21 at 21:50
  • @trentcl I want to only consume the value after the predicate matched. Otherwise, one value always gets consumed, but not added to the result (the behavior I **don't** want is documented here: https://github.com/rust-lang/rust/blob/4aed1abb706a31331fd8b612ca400935aeddd0e0/library/core/src/iter/traits/iterator.rs#L1076-L1097). – Elias Sep 16 '21 at 03:34
  • @eggyal How would I go about only implementing this for `Peekable`? As I (now) understand it the issue here is that I'm always creating a new `Peekable`, thus the stored peeked value gets lost. – Elias Sep 16 '21 at 04:03
  • 1
    You can implement it if you're willing to iterate over copies or clones of the underlying iterator's items—but due to the signature of the `Iterator` trait, it's not possible to have an iterator over borrows from `self`: for that you'd need a so-called "streaming" iterator instead (there's a [crate](https://crates.io/crates/streaming-iterator)). For the specific case given in your question, `Copy` should be fine because you're only iterating `char`: see [playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8aa708701b2587483a60ba607156efbd). – eggyal Sep 16 '21 at 08:26
  • @eggyal The thing is that yes, I'm only iterating over `char`, to be precise this is the tokenizer part of the parser, but I was going to use the same `peek_while` iterator for treebuilder, which would iterate over `Token`. I actually had an idea today that I could create a `PeekableTrait` myself and implement it for `Peekable`? I'm at work right now, so I will test this theory in the evening. – Elias Sep 16 '21 at 08:29
  • 1
    ([Minor revision](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ae06086c0e92bbdbdb584253c4d31a91) to playground in previous comment). If `Token: Copy` then this should be okay for you. But if you really do want references, then you'll need to use a streaming iterator instead. – eggyal Sep 16 '21 at 08:37
  • @eggyal or just create my own custom iterator, in which I can actually advance and revert as I please... I was hoping to get a more... "native" solution. – Elias Sep 16 '21 at 08:43
  • 1
    Just realised that [`Peekable::next_if`](https://doc.rust-lang.org/std/iter/struct.Peekable.html#method.next_if) exists, which means you can actually drop the `Copy` requirement: [playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1e3389218d5a538cccd7a8736cf5546d). But nevertheless, as you say, parsing probably requires more backtracking than pure iteration allows. Sounds like you need a different model altogether. Have you looked into any of the parsing libraries, like [pest](https://crates.io/crates/pest)? – eggyal Sep 16 '21 at 08:51
  • @eggyal I have not! Definitely a good place for references, thank you! And as to the backtracting. I've already implemented the code without iterators, and if I remember correctly it seems to work pretty okay without it. In case you are interested: https://elias-graf.github.io/lazyjson (it's compiled to wasm, if you're really bored you can look at the actual code on my github lol). – Elias Sep 16 '21 at 08:55
  • @eggyal As I see it `next_if` pretty much solves this. Thanks for helping ❤️ Feel free to create an answer if you want to. – Elias Sep 16 '21 at 16:28

0 Answers0