2

I'm trying to read a file that is formatted like this:

ruby 2.6.2
elixir 1.8.3

And convert into a two-dimensional array like this pseudocode:

[
  ["ruby", "2.6.2"]
  ["elixir", "1.8.3"]
]

The code I have to do this in Rust is:

use std::fs::File;
use std::io::prelude::*;
use std::io::{self, BufReader};

use std::path::Path;

pub fn present() -> bool {
    path().exists()
}

pub fn parse() -> Vec<Vec<String>> {
    let f = BufReader::new(file().unwrap());
    f.lines()
        .map(|line| {
            line.unwrap()
                .split_whitespace()
                .map(|x| x.to_string())
                .collect()
        })
        .collect()
}

fn file() -> io::Result<File> {
    let f = File::open(path())?;
    return Ok(f);
}

fn path<'a>() -> &'a Path {
    return Path::new(".tool-versions");
}

What I'm unsure of here, is this line in the middle of the parse function:

.map(|x| x.to_string())

This seems like a bit of "overwork", but I am not sure if my feeling is correct here.

Is there something I'm missing here, or is this the cleanest way to write this code to accomplish this particular task?

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Ryan Bigg
  • 106,965
  • 23
  • 235
  • 261
  • take [nom](https://github.com/Geal/nom) or similar and do some proper parsing, here that a quick and dirty parsing, it's ok but no in a big or long term project. – Stargateur Aug 06 '19 at 23:56
  • 3
    There are several non-idiomatic aspects to this code, but if you are interested in a full review, then you are better off [asking on code review](https://codereview.meta.stackexchange.com/questions/5777/a-guide-to-code-review-for-stack-overflow-users). – Shepmaster Aug 07 '19 at 01:21
  • 2
    Please don't use `Path::exists` except in rare occasions: [How to check whether a path exists?](https://stackoverflow.com/q/32384594/155423) – Shepmaster Aug 07 '19 at 01:23

2 Answers2

3

Your code is generally idiomatic and good, I just have a few minor caveats.

As far as the "overwork", I would argue modifying the string in-place is overwork, since at every removed whitespace character, you either need to do one of 3 things:

  1. Shift every character past that element down 1 (most moves, no allocations required though).
  2. Store an index of every position, iterate in reverse order and shift each element down 1 (less moves, but requires an allocation).
  3. Store an index of every position, track the index of the start current whitespace block, the end of the current whitespace block and the start of the next block,and move every element down, and then tracking these shifts (most complex, least moves required, and still more computational expensive than needed).

Or.. you could allocate a new string. Sometimes simplicity is powerful.

For the rest, one major issue is not to panic unnecessarily, especially not from failing to open a file. Unwraps, unless you can prove they shouldn't occur, are not good for production code. Specifically, the following line might induce a panic.

let f = BufReader::new(file().unwrap());

Better to replace that function with:

pub fn parse() -> io::Result<Vec<Vec<String>>> {
    let f = BufReader::new(file()?);
    f.lines()
        .map(|line| {
            line.map(|x| {
                x
                    .split_whitespace()
                    .map(|x| x.to_string())
                    .collect()
            })
        })
        .collect()
}

This way, the caller of parse can appropriately handle any errors, both during BufReader creation and any errors that occur during lines().

Alex Huszagh
  • 13,272
  • 3
  • 39
  • 67
0

I believe the .to_string() is necessary. But I would change it for clarity.

The .lines() function returns an iterator with a String. But the function split_whitespace returns a SplitWhiteSpace struct. If you look in the source code at line 4213, then you see that's iterating with str.

Because your function returns a String and not an str, you need to convert it to a String. That can be accomplished by the .to_string() function, but I think it'd be cleaner if you were to use String::from().

The result would be:

.map(|x| String::from(x))

This step is not only necessary for the type, but also for lifetimes. The str's you get back have the same lifetime as the SplitWhiteSpace struct. The conversion to String copies the values which allows them to have a different lifetime.

(Take that explanation with a grain of salt, this is still quite new for me too)

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Geoxion
  • 448
  • 1
  • 4
  • 14