9

I need to iterate over lines in a string, but keep the newlines at the end in the strings that are yielded.

There is str.lines(), but the strings it returns have the newline characters chopped off:

let result: Vec<_> = "foo\nbar\n".lines().collect();
assert_eq!(result, vec!["foo", "bar"]);

Here's what I need:

assert_eq!(lines("foo\nbar\n"), vec!["foo\n", "bar\n"]);

More test cases:

assert!(lines("").is_empty());
assert_eq!(lines("f"), vec!["f"]);
assert_eq!(lines("foo"), vec!["foo"]);
assert_eq!(lines("foo\n"), vec!["foo\n"]);
assert_eq!(lines("foo\nbar"), vec!["foo\n", "bar"]);
assert_eq!(lines("foo\r\nbar"), vec!["foo\r\n", "bar"]);
assert_eq!(lines("foo\r\nbar\r\n"), vec!["foo\r\n", "bar\r\n"]);
assert_eq!(lines("\nfoo"), vec!["\n", "foo"]);
assert_eq!(lines("\n\n\n"), vec!["\n", "\n", "\n"]);

I have a solution that basically calls find in a loop, but I'm wondering if there's something more elegant.

This is similar to "Split a string keeping the separators", but in that case the separators are returned as separate items, whereas I want to keep them as part of each line:

["hello\n", "world\n"]; // This
["hello", "\n", "world", "\n"]; // Not this
robinst

3 Answers

9

The solution I currently have looks like this:

/// Iterator yielding every line in a string. The line includes newline character(s).
pub struct LinesWithEndings<'a> {
    input: &'a str,
}

impl<'a> LinesWithEndings<'a> {
    pub fn from(input: &'a str) -> LinesWithEndings<'a> {
        LinesWithEndings { input }
    }
}

impl<'a> Iterator for LinesWithEndings<'a> {
    type Item = &'a str;

    #[inline]
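    fn next(&mut self) -> Option<&'a str> {
        // Nothing left to yield once the remaining input is exhausted.
        if self.input.is_empty() {
            return None;
        }
        // Cut just past the next '\n'; if no newline is left, take the rest.
        let split = self.input.find('\n').map(|i| i + 1).unwrap_or(self.input.len());
        let (line, rest) = self.input.split_at(split);
        self.input = rest;
        Some(line)
    }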
}
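
For comparison, here is a minimal sketch of a more compact variant. It assumes Rust 1.51 or later, where `str::split_inclusive` is available in the standard library. The helper name `lines` is simply the hypothetical function from the question's assertions, and collecting into a `Vec` is only for the sake of those assertions:

```rust
/// Splits `input` into lines, keeping each line's terminator ("\n" or "\r\n").
/// `lines` is the hypothetical helper name used in the question's assertions.
pub fn lines(input: &str) -> Vec<&str> {
    // `split_inclusive` leaves the matched '\n' at the end of each piece and
    // does not yield a trailing empty piece, so "" produces no items.
    // A preceding '\r' stays attached because only '\n' is the split point.
    input.split_inclusive('\n').collect()
}
```

Like the hand-written iterator, this borrows from the input and allocates no new strings. If the `Vec` is not needed, `input.split_inclusive('\n')` can be iterated directly.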
robinst
0

My quick and dirty solution is this:

"foo\nbar\n".lines().map(|x| format!("{}\n", x)).collect::<Vec<String>>()

This works, but the format! macro allocates a new String for each line just to append the newline. That is wasteful, and it should be much slower than robinst's solution. It also fails the provided assertions, because you get Strings out of it instead of &strs, and because it appends "\n" to the last line even when the input did not end with one.

attila
  • Hey! Yeah, that works, but because of the `format!` call it unnecessarily allocates a new `String` for each line, which I avoided in my solution. – robinst Mar 16 '23 at 01:03
  • Thanks, good point, I should've said that. It's the reason why I called it "quick and dirty". – attila Mar 17 '23 at 06:03
0

There is now a beautiful approach, which is still feature-gated but seems to be making its way into stable.

Just add .intersperse("\n") right before collecting the lines, and you're all set.

Note that at the moment (2023-03-19) it requires adding #![feature(iter_intersperse)] at the beginning of the crate root.

PS Actually, this is an answer to "How do I keep newlines when using lines() and collect() to return a String?", which is closed with a link to this question. I feel the answer is still useful, though it solves a related issue rather than exactly this question. Maybe the mighty moderators could reopen the linked question and move this answer there to keep things tidy.
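
For readers on stable Rust, a roughly equivalent result for that related question (lines rejoined into a single String) can be sketched with `join` on a collected `Vec`. This swaps `intersperse` for `join`, so it is not the feature-gated approach itself, and `rejoin_lines` is a hypothetical name chosen for this sketch:

```rust
/// Rejoins the lines of `input` with "\n" between them, returning one `String`.
/// `rejoin_lines` is a hypothetical name chosen for this sketch.
pub fn rejoin_lines(input: &str) -> String {
    // `lines()` strips "\n" and "\r\n"; `join` puts "\n" back between items
    // only, so the result has no trailing newline and "\r\n" becomes "\n".
    input.lines().collect::<Vec<_>>().join("\n")
}
```

Note that the separator is added only between lines, so a trailing newline in the input is not preserved.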

Sergey Kaunov
  • Nit: Features don't go in the beginning of the file you use the feature in. They go in your crate root (i.e. usually your `lib.rs` or `bin.rs`) – Filipe Rodrigues Mar 19 '23 at 18:37
  • Won't that produce `["hello", "\n", "world", "\n"]` instead of `["hello\n", "world\n"]`? – robinst Mar 20 '23 at 06:17
  • @FilipeRodrigues, thank you -- updated the answer. I should have double-checked it, but relied on my memory. (= – Sergey Kaunov Mar 20 '23 at 11:02
  • @robinst, exactly. That's how PS in the end was born. I actually got to this question from the one I linked (which is closed, unfortunately), and after adding the answer took a moment to read the open question again and find out that this helpful addition doesn't exactly solve **this** one problem. – Sergey Kaunov Mar 20 '23 at 11:05