Ultimately, you are going to run into a problem: you are trying to write to the same file you are reading from. In this case, it's safe because you are going to read the entire file, so you don't need it after that. However, if you did try to write to the file, you'd see that opening a file for reading doesn't allow writing! Here's the code to do that:
use std::{
fs::File,
io::{BufRead, BufReader, Write},
};
fn main() {
let mut file = File::open("test.txt").expect("file error");
let reader = BufReader::new(&mut file);
let mut lines: Vec<_> = reader
.lines()
.map(|l| l.expect("Couldn't read a line"))
.collect();
lines.sort();
lines.dedup();
for line in lines {
file.write_all(line.as_bytes())
.expect("Couldn't write to file");
}
}
Here's the output:
% cat test.txt
a
a
b
a
% cargo run
thread 'main' panicked at 'Couldn't write to file: Os { code: 9, kind: Other, message: "Bad file descriptor" }', src/main.rs:12:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
You could open the file for both reading and writing:
use std::{
fs::OpenOptions,
io::{BufRead, BufReader, Write},
};
fn main() {
let mut file = OpenOptions::new()
.read(true)
.write(true)
.open("test.txt")
.expect("file error");
// Remaining code unchanged
}
But then you'd see that (a) the output is appended and (b) all the newlines are lost on the new lines because BufRead
doesn't include them.
We could reset the file pointer back to the beginning, but then you'd probably leave trailing stuff at the end (deduplicating is likely to have less bytes written than read). It's easier to just reopen the file for writing, which will truncate the file. Also, let's use a set data structure to do the deduplication for us!
use std::{
collections::BTreeSet,
fs::File,
io::{BufRead, BufReader, Write},
};
fn main() {
let file = File::open("test.txt").expect("file error");
let reader = BufReader::new(file);
let lines: BTreeSet<_> = reader
.lines()
.map(|l| l.expect("Couldn't read a line"))
.collect();
let mut file = File::create("test.txt").expect("file error");
for line in lines {
file.write_all(line.as_bytes())
.expect("Couldn't write to file");
file.write_all(b"\n").expect("Couldn't write to file");
}
}
And the output:
% cat test.txt
a
a
b
a
a
b
a
b
% cargo run
% cat test.txt
a
b
The less-efficient but shorter solution is to read the entire file as one string and use str::lines
:
use std::{
collections::BTreeSet,
fs::{self, File},
io::Write,
};
fn main() {
let contents = fs::read_to_string("test.txt").expect("can't read");
let lines: BTreeSet<_> = contents.lines().collect();
let mut file = File::open("test.txt").expect("can't create");
for line in lines {
writeln!(file, "{}", line).expect("can't write");
}
}
See also: