
Currently, I'm using this function in my code:

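use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::Path;

// Reads every line of the file into memory; panics on a missing file or unreadable line.
fn lines_from_file(filename: impl AsRef<Path>) -> Vec<String> {
    let file = File::open(filename).expect("no such file");
    let buf = BufReader::new(file);
    buf.lines().map(|l| l.expect("Could not parse line")).collect()
}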

How can I safely read only the last x lines of the file?

Brandon Kauffman
  • Memory map the file, scan backwards successively for newlines `x + 1` times, pull out everything after that last newline and convert it to lines? [Same basic solution in any language](https://stackoverflow.com/a/34029605/364696). – ShadowRanger Nov 01 '22 at 21:11

2 Answers


The tail crate claims to provide an efficient way to read the final n lines of a file through its BackwardsReader struct, and it looks fairly easy to use. I can't vouch for its efficiency (it appears to perform progressively larger reads, seeking further and further back in the file, which is slightly suboptimal relative to an optimized memory-map-based solution), but it's an easy all-in-one package, and the inefficiencies likely won't matter in 99% of use cases.
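For readers who want to see the seek-backwards idea without pulling in a crate, here is a minimal std-only sketch. It is not the tail crate's implementation: the function name `last_lines`, the fixed 8 KiB block size, and the lossy UTF-8 decoding are all assumptions made for illustration. It reads fixed-size blocks from the end of the file until it has seen more than `n` newlines, then keeps the last `n` lines.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

/// Returns up to the last `n` lines of `file`, in file order.
/// Hypothetical helper for illustration; not part of any crate.
fn last_lines(file: &mut File, n: usize) -> io::Result<Vec<String>> {
    const CHUNK: u64 = 8 * 1024; // arbitrary block size chosen for this sketch
    let mut pos = file.seek(SeekFrom::End(0))?; // start at end of file
    let mut tail: Vec<u8> = Vec::new(); // bytes from `pos` to end of file
    let mut newlines = 0usize;

    // Walk backwards one block at a time until the tail contains more than
    // `n` newlines (so the last `n` lines are complete) or we hit the start.
    while pos > 0 && newlines <= n {
        let len = CHUNK.min(pos);
        pos -= len;
        file.seek(SeekFrom::Start(pos))?;
        let mut block = vec![0u8; len as usize];
        file.read_exact(&mut block)?;
        newlines += block.iter().filter(|&&b| b == b'\n').count();
        // Prepend the new block; simple rather than optimal for a sketch.
        block.extend_from_slice(&tail);
        tail = block;
    }

    // Invalid UTF-8 is replaced rather than treated as an error (an assumption).
    let text = String::from_utf8_lossy(&tail);
    let lines: Vec<&str> = text.lines().collect();
    let start = lines.len().saturating_sub(n);
    Ok(lines[start..].iter().map(|s| s.to_string()).collect())
}
```

Calling `last_lines(&mut file, 10)` on an opened `File` returns at most ten lines and reads only as many blocks from the end as it needs, so memory use scales with the size of the tail rather than the whole file.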

ShadowRanger
  • I don't use Rust enough (yet) to know if there's a generator equivalent (I'm sure there is), but a memory-map solution seems potentially memory intensive, especially in the case of a buf reader that doesn't know the length of a file. – Brandon Kauffman Nov 01 '22 at 23:14
  • @BrandonKauffman: Memory mapping is *virtual* memory address space intensive (I don't recommend it if you're on a 32-bit system), but it's the opposite of RAM intensive. Data is paged in solely on demand, a page at a time (the OS might prefetch a few pages beyond those accessed based on access patterns, more if you use an API like `madvise` to ask for more). Everything else is pages that will be populated on demand, consuming no RAM until used and easily dropped under memory pressure since they can always be reread from disk. – ShadowRanger Nov 01 '22 at 23:18

To avoid holding entire files in memory (the files were quite large), I chose to use the rev_buf_reader crate and take only the first x elements from its reversed line iterator:

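use rev_buf_reader::RevBufReader;
use std::fs::File;
use std::io::BufRead;

// Lines are read from the end of the file, so only `limit` lines are collected.
fn lines_from_file(file: &File, limit: usize) -> Vec<String> {
    let buf = RevBufReader::new(file);
    buf.lines().take(limit).map(|l| l.expect("Could not parse line")).collect()
}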
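Since the question asks about reading safely, here is a small sketch of the same collection step that returns an error instead of panicking and puts the lines back in file order. It assumes the reversed reader's `lines()` yields `io::Result<String>` items last line first, as std's `BufRead::lines` does for forward reads. The helper name `tail_in_file_order` is hypothetical, and it is written against a plain iterator so it does not depend on any particular crate.

```rust
use std::io;

/// Takes up to `limit` lines from an iterator that yields lines last-first,
/// propagates the first I/O error, and returns the lines in file order.
/// Hypothetical helper for illustration only.
fn tail_in_file_order<I>(rev_lines: I, limit: usize) -> io::Result<Vec<String>>
where
    I: Iterator<Item = io::Result<String>>,
{
    // Collecting into io::Result<Vec<_>> stops at the first Err.
    let mut lines = rev_lines
        .take(limit)
        .collect::<io::Result<Vec<String>>>()?;
    lines.reverse(); // restore top-to-bottom order
    Ok(lines)
}
```

Under that assumption, the body of the function above could become `tail_in_file_order(buf.lines(), limit)`, with the return type changed to `io::Result<Vec<String>>`.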
Brandon Kauffman