I have a destination Vec to which I want to append exactly n bytes read from a Read object.

Read, however, generally deals with slices. The only method which uses Vec is read_to_end, but I need to "bound" the reading to only n bytes.

So far I can see three ways to implement this, but I'm not sure whether I'm missing something, or which one is best (or at least the least bad):

1. use read_to_end anyway

This is the shortest by far, but it also involves the most stdlib understanding, and I'm not sure I grasp the semantics completely / correctly:

fn read1<T: Read>(mut reader: T, n: usize, sink: &mut Vec<u8>) -> Result<()> {
    reader.take(n as u64).read_to_end(sink)?;
    Ok(())
}

If this were not a utility function I think by_ref would also be useful in order to not consume the source reader?
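A minimal sketch of how that could look, assuming the reader is passed as `&mut R` and that a short read should be reported as an error. `take` followed by `read_to_end` returns `Ok` with however many bytes were available, so the returned count has to be compared against n to get the "exactly n bytes" behaviour. The name `read_exact_into` and the choice of `UnexpectedEof` are illustrative, not anything prescribed by the standard library:

```rust
use std::io::{self, Read};

/// Hypothetical helper: append exactly `n` bytes from `reader` to `sink`,
/// leaving `sink` unchanged if fewer than `n` bytes are available.
fn read_exact_into<R: Read>(reader: &mut R, n: usize, sink: &mut Vec<u8>) -> io::Result<()> {
    let before = sink.len();
    // `by_ref` borrows the reader, so the caller can keep using it afterwards.
    match reader.by_ref().take(n as u64).read_to_end(sink) {
        // `read_to_end` reports how many bytes it appended.
        Ok(read) if read == n => Ok(()),
        Ok(_) => {
            // The source ended early: drop the partial data (an assumed policy).
            sink.truncate(before);
            Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "source ended before n bytes were read",
            ))
        }
        Err(e) => {
            // `read_to_end` may have appended some bytes before failing.
            sink.truncate(before);
            Err(e)
        }
    }
}
```

Whether to truncate on failure depends on the caller; keeping the partial data is equally possible if it is still useful.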

2. extend the Vec and read into that buffer

This is way more involved and has the issue that we're adding completely invalid data to the Vec in order to have a proper "slice" to read into:

fn read2<T: Read>(mut reader: T, n: usize, sink: &mut Vec<u8>) -> Result<()> {
    let before = sink.len();
    let after = sink.len() + n;
    sink.resize(after, 0);
    if let Err(e) = reader.read_exact(&mut sink[before..after]) {
        sink.resize(before, 0);
        return Err(e.into());
    }
    Ok(())
}

This assumes read_exact and resize (down) don't panic.
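For the shrinking half of that assumption, `Vec::truncate` might be a clearer fit than resizing down: it only ever shortens the vector and needs no fill value. A sketch of the same approach with that swap, assuming `Result` in the snippet above is `std::io::Result` (the name `read2_truncate` is made up for illustration):

```rust
use std::io::{self, Read};

/// Hypothetical variant of `read2` that rolls back with `truncate`.
fn read2_truncate<R: Read>(mut reader: R, n: usize, sink: &mut Vec<u8>) -> io::Result<()> {
    let before = sink.len();
    // Zero-fill the tail so `read_exact` has an initialized slice to write into.
    sink.resize(before + n, 0);
    if let Err(e) = reader.read_exact(&mut sink[before..]) {
        // Roll back: drop the placeholder bytes and any partially read data.
        sink.truncate(before);
        return Err(e);
    }
    Ok(())
}
```

A reader implementation could still panic inside its own `read`, so this narrows the assumption without removing it.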

3. use a local buffer

This is the most simplistic approach, yet also the most complicated to write: repeatedly read data into a local buffer, then copy from the local buffer into the output vec.

fn read3<T: Read>(mut reader: T, mut n: usize, sink: &mut Vec<u8>) -> Result<()> {
    let before = sink.len();
    let mut buf = [0;64];
    
    while n > 0 {
        let b = &mut buf[0..min(64, n)];
        if let Err(e) = reader.read_exact(b) {
            sink.resize(before, 0);
            return Err(e.into());
        }
        sink.extend(&*b);
        n -= b.len();
    }
    
    Ok(())
}

It's what I'd do in a lower-level language, but it feels icky. Unlike version 2, it leaves partial data in the output vec on failure, which we may need to strip out (I elected to do so here, but there may be cases where that's unnecessary). On the other hand, whatever partial data it adds is valid data read from the source, not zero-filled placeholders.

Masklinn
  • `resize` isn't putting *invalid* data in the vec - it is initializing the newly added portion with a known value (0 in this case). – harmic Apr 07 '21 at 07:37
  • Per the [second answer](https://stackoverflow.com/a/30413877) you seem to want to use `Read::take` followed by `Read::read_to_end`. – E_net4 Apr 07 '21 at 09:03
  • @harmic it's not invalid from a memory safety standpoint, but it's basically garbage from the application's, and there's no chance it will be considered valid if the application expects structured data input (which it probably does), so I think "invalid data" is a correct qualifier, with respect to the application's expectation. – Masklinn Apr 07 '21 at 09:41
  • @E_net4thedownvoter ah so option #1 would be the most favoured answer. Also while the solution is the same, the question is not: my question is not how to read a specific number of bytes, but how to read a specific number of bytes *into a(n existing) Vec*. That was the bit I was unsure of. – Masklinn Apr 07 '21 at 09:42
  • Why do you need to pass a &mut to the buffer? If it is a newly created buffer, you could simply create a buffer of `n` size, read it all, then return a vector from the buffer itself. – Netwave Apr 07 '21 at 10:03
  • @Netwave because the goal is to use a caller-provided buffer (e.g. so it can be reused, and maybe to allow custom allocators eventually), but the caller has no idea how much space is ultimately needed. – Masklinn Apr 07 '21 at 11:20
  • *while the solution is the same, the question is not*: that's the point of duplicates on Stack Overflow and why the closed text says "this question already has **answers** here", not "this is the same question as". This duplicate remains as a signpost for SEO purposes, using the unique phrasing and text that you used. – Shepmaster Apr 07 '21 at 11:38
  • @Shepmaster that's not what [the help center](https://stackoverflow.com/help/duplicates) hints at. It says that there are many ways to ask *the same question*; it doesn't say that different questions whose answers overlap are duplicates. In fact that doesn't make sense on its face: the flag is to mark *questions* as *duplicates*, not to mark questions as already answered in completely different questions. In fact [handling duplicate questions](https://stackoverflow.blog/2009/04/29/handling-duplicate-questions/) hints at the opposite in talking of "borderline duplicates". – Masklinn Apr 07 '21 at 13:41
