
I am trying to iterate over characters in stdin. The Read.chars() method achieves this goal, but is unstable. The obvious alternative is to use Read.lines() with a flat_map to convert it to a character iterator.

This seems like it should work, but it doesn't: it fails to compile with "borrowed value does not live long enough" errors.

use std::io::BufRead;

fn main() {
    let stdin = std::io::stdin();
    let mut lines = stdin.lock().lines();
    let mut chars = lines.flat_map(|x| x.unwrap().chars());
}

This is mentioned in Read file character-by-character in Rust, but it doesn't really explain why.

What I am particularly confused about is how this differs from the example in the documentation for flat_map, which uses flat_map to apply .chars() to a vector of strings. I don't really see how that should be any different. The main difference I see is that my code needs to call unwrap() as well, but changing the last line to the following does not work either:

let mut chars = lines.map(|x| x.unwrap());
let mut chars = chars.flat_map(|x| x.chars());

It fails on the second line, so the issue doesn't appear to be the unwrap.

Why does this last line not work, when the very similar line in the documentation does? Is there any way to get this to work?

Ian D. Scott

1 Answer

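Start by figuring out what the type of the closure's variable is. Binding it to the unit pattern forces a type mismatch, and the resulting compiler error names the actual type: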

let mut chars = lines.flat_map(|x| {
    let () = x;
    x.unwrap().chars()
});

This shows it's a Result<String, io::Error>. After unwrapping it, it will be a String.
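If a deliberate compile error is inconvenient, the type can also be inspected at runtime. The following is a minimal sketch using `std::any::type_name`; the helper `type_name_of` is a hypothetical name introduced here, and the exact strings it returns are best-effort diagnostics that may differ between compiler versions.

```rust
use std::any::type_name;
use std::io;

// Hypothetical helper (not part of std): reports the compiler's name for
// the type of the value behind the reference.
fn type_name_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // Same shape as the items produced by `lines()`.
    let x: Result<String, io::Error> = Ok(String::from("abc"));
    println!("{}", type_name_of(&x));

    // After unwrapping, only the owned String is left.
    let s = x.unwrap();
    println!("{}", type_name_of(&s));
}
```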

Next, look at str::chars:

fn chars(&self) -> Chars<'_>

And the definition of Chars:

pub struct Chars<'a> {
    // some fields omitted
}

From that, we can tell that calling chars on a string returns an iterator that has a reference to the string.

Whenever we have a reference, we know that the reference cannot outlive the thing that it is borrowed from. In this case, x.unwrap() is the owner. The next thing to check is where that ownership ends. In this case, the closure owns the String, so at the end of the closure, the value is dropped and any references are invalidated.

Except the code tried to return a Chars that still referred to the string. Oops. Thanks to Rust, the code didn't segfault!

The difference with the example that works is all in the ownership. In that case, the strings are owned by a vector outside of the closure, so they are not dropped before the iterator is consumed. Thus there are no lifetime issues.
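To make the contrast concrete, here is a small sketch in the spirit of the documentation example (not a copy of it). The function name `chars_of` and the sample words are made up for illustration. The closure only receives a `&String`, so the `Chars` it returns borrows from the vector, which outlives the iterator.

```rust
// Hypothetical helper: flattens borrowed strings into their characters.
// `words` owns the strings, so each `Chars` borrows data that outlives it.
fn chars_of(words: &[String]) -> Vec<char> {
    words.iter().flat_map(|s| s.chars()).collect()
}

fn main() {
    let words = vec![String::from("alpha"), String::from("beta")];

    // The vector is still alive here, so the borrows inside flat_map are valid.
    for c in chars_of(&words) {
        println!("{}", c);
    }
}
```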

What this code really wants is an into_chars method on String. That iterator could take ownership of the value and return characters.


Not the maximum efficiency, but a good start:

struct IntoChars {
    s: String,
    offset: usize,
}

impl IntoChars {
    fn new(s: String) -> Self {
        IntoChars { s, offset: 0 }
    }
}

impl Iterator for IntoChars {
    type Item = char;

    fn next(&mut self) -> Option<Self::Item> {
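        // Re-slice the owned string from the current byte offset.
        let remaining = &self.s[self.offset..];

        // Decode the next character (or stop at the end of the string),
        // then advance the offset by that character's UTF-8 width.
        let c = remaining.chars().next()?;
        self.offset += c.len_utf8();
        Some(c)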
    }
}

use std::io::BufRead;

fn main() {
    let stdin = std::io::stdin();
    let lines = stdin.lock().lines();
    let chars = lines.flat_map(|x| IntoChars::new(x.unwrap()));

    for c in chars {
        println!("{}", c);
    }
}
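A shorter alternative, at the cost of one allocation per line, is to collect each line's characters into an owned `Vec<char>` inside the closure, so nothing returned from the closure borrows from the temporary `String`. This is only a sketch: the function name `owned_chars` is hypothetical, and `main` reads from an in-memory byte slice to keep the example self-contained. Since `stdin.lock()` also implements `BufRead`, it should be usable as the argument in the same way.

```rust
use std::io::BufRead;

// Hypothetical helper: yields every character of every line of `reader`.
// Each line is copied into a Vec<char>, which owns its data, so the String
// can be dropped at the end of the closure without leaving a dangling borrow.
fn owned_chars<R: BufRead>(reader: R) -> impl Iterator<Item = char> {
    reader
        .lines()
        .flat_map(|line| line.unwrap().chars().collect::<Vec<char>>())
}

fn main() {
    // `&[u8]` implements BufRead; note that `lines()` strips the newlines.
    let input = "ab\ncd\n".as_bytes();

    for c in owned_chars(input) {
        println!("{}", c);
    }
}
```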

Shepmaster
  • Ah, I see, thanks! I suppose what I was confused by is the fact that the functions all deal with `String` rather than `&str`, so it seems like it should be moving the values. But that is not the case since the closures aren't returning the actual values, but rather iterators that are later lazily evaluated, and those iterators contain references to the original object. – Ian D. Scott Nov 01 '16 at 01:03
  • @IanD.Scott While `chars` can be called on a `String`, note that it takes `&self` (a reference) and that it's actually implemented via `Deref`, meaning that the implementation is actually on `str`. Thus `&self` => `&str`. – Shepmaster Nov 01 '16 at 01:07
  • Thank you for the `let () = x;` trick for determining the variable type! – Slava Semushin Nov 02 '16 at 12:56
  • @SlavaSemushin [How do I print the type of a variable in Rust?](http://stackoverflow.com/q/21747136/155423). – Shepmaster Nov 02 '16 at 13:00