My question seems to be closely related to Rust error "cannot infer an appropriate lifetime for borrow expression" when attempting to mutate state inside a closure returning an Iterator, but I don't think it's the same. So, this

use std::iter;                                                                  

fn example(text: String) -> impl Iterator<Item = Option<String>> {            
    let mut i = 0;   
    let mut chunk = None;   
    iter::from_fn(move || {   
        if i <= text.len() {   
            let p_chunk = chunk;
            chunk = Some(&text[..i]);   
            i += 1;   
            Some(p_chunk.map(|s| String::from(s))) 
        } else {   
            None   
        }   
    })   
}   

fn main() {}

does not compile. The compiler says it cannot infer an appropriate lifetime for &text[..i]. This is the smallest example I could come up with. The idea is that there is some internal state, which is a slice of text, and the iterator returns new Strings allocated from that internal state. I'm new to Rust, so maybe it's all obvious, but how would I annotate lifetimes so that this compiles?

Note that this example is different from the linked one, because there point was passed as a reference, while here text is moved. Also, the answer there is one and a half years old by now, so maybe there is an easier way.

EDIT: Added p_chunk to emphasize that chunk needs to be persistent across calls to next and so cannot be local to the closure but should be captured by it.

Andrei
  • This is an example of attempting to create a [self-referential struct](https://stackoverflow.com/q/32300132/1600898). `text` is moved into the closure, and `chunk`, also moved into the closure, refers to the contents of `text`. The result is a self-referential struct, which is not supported by the current borrow checker. (While self-referential structs are unsafe in general, in _this case_ it would be safe because `text` is heap-allocated, not mutated, and doesn't escape the closure, so a sufficiently smart borrow checker could prove that what you're trying to do is safe.) – user4815162342 Jun 13 '21 at 17:44
  • It's not immediately clear if that's causing the error, though quite possible. The answer to the discussion you cite says that referencing through an Option is possible but the structure cannot be moved afterwards. In my case, the self-reference is created *after* `text` and `chunk` were moved in place, and they are never moved again, so in principle it should work. But if it's indeed an edge case for a borrow checker, I should probably submit a bug. – Andrei Jun 14 '21 at 07:21
  • It should work in principle, but it is well known that the current borrow checker doesn't support it. (The support would require multiple new features: the borrow checker should special-case heap-allocated types like `Box` or `String` whose moves don't affect references into their content, *and* in this case also prove that you don't resize or `mem::replace()` the closed-over `String`.) You can of course still submit a bug report, but you can be sure that all the above limitations are already familiar to the devs. – user4815162342 Jun 14 '21 at 07:28
  • Thanks for clarifying. Maybe post your comment as an answer so that I can close the question? – Andrei Jun 14 '21 at 07:46
  • I originally refrained from answering because there are already many questions and answers regarding self-referential structs, but there is now sufficient content about _this_ case that it probably deserves an answer, so I've now posted one. – user4815162342 Jun 14 '21 at 07:54

2 Answers

If you declare the chunk Option inside the closure, your code compiles. I can't quite explain why declaring chunk outside the closure results in a lifetime error for the borrow of text inside the closure, but the chunk Option looks superfluous anyway, and the following code should be equivalent:

use std::iter;

fn example(text: String) -> impl Iterator<Item = Option<String>> {
    let mut i = 0;
    iter::from_fn(move || {
        if i <= text.len() {
            let chunk = text[..i].to_string();
            i += 1;
            Some(Some(chunk))
        } else {
            None
        }
    })
}

Additionally, it seems unlikely that you really want an Iterator<Item = Option<String>> here instead of an Iterator<Item = String>, since the iterator never yields Some(None) anyway.

use std::iter;

fn example(text: String) -> impl Iterator<Item = String> {
    let mut i = 0;
    iter::from_fn(move || {
        if i <= text.len() {
            let chunk = text[..i].to_string();
            i += 1;
            Some(chunk)
        } else {
            None
        }
    })
}

Note that you can also write this iterator without allocating a String for each chunk, if you take a &str as an argument and tie the lifetime of the output to the input argument:

use std::iter;

fn example<'a>(text: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    let mut i = 0;
    iter::from_fn(move || {
        if i <= text.len() {
            let chunk = &text[..i];
            i += 1;
            Some(chunk)
        } else {
            None
        }
    })
}
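
One caveat, as far as I can tell: all of the versions above slice by byte offset and bump i by one, so &text[..i] will panic as soon as i lands inside a multi-byte UTF-8 character. If non-ASCII input is possible, a variation along these lines might be safer. It is only a sketch; the name prefixes is made up for illustration, and it assumes you want one prefix per character rather than per byte:

```rust
use std::iter;

/// Yields every prefix of `text` that ends on a character boundary,
/// from the empty prefix up to the whole string.
/// (`prefixes` is a hypothetical name, not part of the original code.)
fn prefixes<'a>(text: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    text.char_indices()
        // Byte offset at which each character starts.
        .map(|(idx, _)| idx)
        // The final prefix ends at the end of the string.
        .chain(iter::once(text.len()))
        // Every offset is a valid boundary, so slicing cannot panic here.
        .map(move |end| &text[..end])
}
```

For ASCII-only input this should yield the same prefixes as the from_fn version above.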
sebpuetz
  • Hi. I'll try to revise my example, as I don't think I managed to articulate well what I'm aiming for. I need the chunk to be declared outside, because the chunk needs to keep the state from a previous call to `next`. (Though now that I think of it, I can just keep the integers delimiting the chunk start and end positions.) The extra Option<> and moving instead of referencing look unmotivated in this example, but I need those for the original code from which this example was simplified. – Andrei Jun 13 '21 at 11:35
  • Yes, just closing over the start and end of chunk, and constructing the slice locally works. I'll leave my question open though. It should be possible somehow to close over the slice... – Andrei Jun 13 '21 at 12:38

Your code is an example of attempting to create a self-referential struct, where the struct is implicitly created by the closure. Since both text and chunk are moved into the closure, you can think of both as members of a struct. As chunk refers to the contents of text, the result is a self-referential struct, which is not supported by the current borrow checker.

While self-referential structs are unsafe in general due to moves, in this case it would be safe because text is heap-allocated and is not subsequently mutated, nor does it escape the closure. Therefore it is impossible for the contents of text to move, and a sufficiently smart borrow checker could prove that what you're trying to do is safe and allow the closure to compile.
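
To illustrate the "heap-allocated" part, here is a small sketch (the function name buffer_survives_move is made up for this example). It only demonstrates the runtime behaviour: moving a String copies its pointer, length and capacity, while the heap buffer that a slice would point into stays where it is. It does not show anything the borrow checker can currently reason about.

```rust
/// Moves `s` into a new binding and reports whether the heap buffer
/// is still at the same address afterwards.
/// (`buffer_survives_move` is a hypothetical helper for illustration.)
fn buffer_survives_move(s: String) -> bool {
    // Raw address of the string's buffer; a raw pointer holds no borrow.
    let before = s.as_ptr();
    // Moving the String only relocates its (pointer, length, capacity) fields.
    let moved = s;
    before == moved.as_ptr()
}
```

Growing the String (for example with push_str) may reallocate the buffer, which is why the argument depends on text never being mutated.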

> The answer to the [linked question] says that referencing through an Option is possible but the structure cannot be moved afterwards. In my case, the self-reference is created after text and chunk were moved in place, and they are never moved again, so in principle it should work.

Agreed - it should work in principle, but it is well known that the current borrow checker doesn't support it. The support would require multiple new features: the borrow checker should special-case heap-allocated types like Box or String whose moves don't affect references into their content, and in this case also prove that you don't resize or mem::replace() the closed-over String.

In this case the best workaround is the "obvious" one: instead of persisting the chunk slice, persist a pair of usize indices (or a Range) and create the slice when you need it.
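
A minimal sketch of that workaround, keeping the shape of the code in the question (including p_chunk and the extra Option). The choice of Range<usize> and Option::replace is mine; a pair of plain usize values would work just as well. Like the original, it slices by byte offset, so it assumes ASCII text:

```rust
use std::iter;
use std::ops::Range;

fn example(text: String) -> impl Iterator<Item = Option<String>> {
    let mut i = 0;
    // Persist the previous chunk as byte offsets rather than as a `&str`,
    // so the closure never holds a reference into its own `text`.
    let mut chunk: Option<Range<usize>> = None;
    iter::from_fn(move || {
        if i <= text.len() {
            // Store the new range and get the previous one back.
            // (Range is not Copy, so a plain `let p_chunk = chunk;` won't do.)
            let p_chunk = chunk.replace(0..i);
            i += 1;
            // Build the slice only when it is needed, then allocate from it.
            Some(p_chunk.map(|r| text[r].to_string()))
        } else {
            None
        }
    })
}
```

The first call yields Some(None) because there is no previous chunk yet; each later call yields the chunk recorded by the call before it.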

user4815162342