
The Read trait is implemented for &[u8]. How can I get a Read trait over several concatenated u8 slices without actually doing any concatenation first?

If I concatenate first, there will be two copies: first from the multiple arrays into a single array, then from that single array into the destination via the Read trait. I would like to avoid the first copy.

I want a Read trait over &[&[u8]] that treats multiple slices as a single continuous slice.

fn foo<R: std::io::Read + Send>(data: R) {
    // ...
}

let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];

foo(c); // <- this won't compile because `c` is not a slice of bytes.
Shepmaster
Yuri Astrakhan
  • You seem confused. The `Read` trait *fundamentally* involves a copy into the destination buffer(s). (I assume you already know that `self` is normally a file-like object) – o11c Apr 08 '22 at 17:48
  • @o11c of course it is. I just don't want one extra copy of multiple arrays into a single (concatenated) array from which the read interface will copy. (I clarified the question) – Yuri Astrakhan Apr 08 '22 at 17:58
  • I’m not sure how [`chain`](https://doc.rust-lang.org/std/io/struct.Chain.html) would perform, but it might be worth a look. – Ry- Apr 08 '22 at 18:08
  • `Read::chain` would require dynamic allocation, similar to that found in [Creating Diesel.rs queries with a dynamic number of .and()'s](https://stackoverflow.com/q/48696290/155423) (and questions linked to it). – Shepmaster Apr 08 '22 at 18:11
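
To make the `chain` discussion in the comments above concrete, here is a minimal sketch. For a fixed number of slices, `Read::chain` composes the readers statically. For a slice count known only at run time, one option is to fold the slices into a `Box<dyn Read>`, which costs one heap allocation per link; that is my reading of the "dynamic allocation" remark, not something the comment spells out. The names `read_two` and `chain_all` are hypothetical.

```rust
use std::io::{self, Read};

/// Chains exactly two slices. The reader has the static type
/// `Chain<&[u8], &[u8]>`, so building it needs no heap allocation.
fn read_two(a: &[u8], b: &[u8]) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    a.chain(b).read_to_end(&mut out)?;
    Ok(out)
}

/// Chains a run-time number of slices. Every link is boxed so that all
/// intermediate readers share one type (hypothetical helper).
fn chain_all<'a>(parts: &[&'a [u8]]) -> Box<dyn Read + Send + 'a> {
    parts.iter().fold(
        // Start from a reader that is immediately at EOF.
        Box::new(io::empty()) as Box<dyn Read + Send + 'a>,
        |acc, &part| -> Box<dyn Read + Send + 'a> { Box::new(acc.chain(part)) },
    )
}
```

The boxed version nests one `Chain` per slice, so each `read` call may pass through several layers before reaching the slice that still has data. A dedicated wrapper type avoids both the allocations and the nesting.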

2 Answers


You could use the multi_reader crate, which can concatenate any number of values that implement Read:

let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];

foo(multi_reader::MultiReader::new(c.iter().copied()));

If you don't want to depend on an external crate, you can wrap the slices in a struct of your own and implement Read for it:

use std::io::Read;

struct MultiRead<'a> {
    sources: &'a [&'a [u8]],
    pos_in_current: usize,
}

impl<'a> MultiRead<'a> {
    fn new(sources: &'a [&'a [u8]]) -> MultiRead<'a> {
        MultiRead {
            sources,
            pos_in_current: 0,
        }
    }
}

impl Read for MultiRead<'_> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let current = loop {
            if self.sources.is_empty() {
                return Ok(0); // EOF
            }
            let current = self.sources[0];
            if self.pos_in_current < current.len() {
                break current;
            }
            self.pos_in_current = 0;
            self.sources = &self.sources[1..];
        };
        let read_size = buf.len().min(current.len() - self.pos_in_current);
        buf[..read_size].copy_from_slice(&current[self.pos_in_current..][..read_size]);
        self.pos_in_current += read_size;
        Ok(read_size)
    }
}

user4815162342

Create a wrapper type around the slices and implement Read for it. Compared to user4815162342's answer, I delegate down to the implementation of Read for slices:

use std::{io::Read, mem};

struct Wrapper<'a, 'b>(&'a mut [&'b [u8]]);

impl<'a, 'b> Read for Wrapper<'a, 'b> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let slices = mem::take(&mut self.0);

        match slices {
            [head, ..] => {
                let n_bytes = head.read(buf)?;

                if head.is_empty() {
                    // Advance the child slice
                    self.0 = &mut slices[1..];
                } else {
                    // More to read, put back all the child slices
                    self.0 = slices;
                }

                Ok(n_bytes)
            }
            _ => Ok(0),
        }
    }
}

fn main() {
    let parts: &mut [&[u8]] = &mut [b"hello ", b"world"];
    let mut w = Wrapper(parts);

    let mut buf = Vec::new();
    w.read_to_end(&mut buf).unwrap();
    assert_eq!(b"hello world", &*buf);
}

A more efficient implementation would implement further methods from Read, such as read_to_end or read_vectored.
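
As a rough illustration of that last point, the sketch below overrides `read_to_end` so the destination `Vec` is reserved once and each slice is appended with a single `extend_from_slice`. It is not the `Wrapper` type from this answer: `SliceChain` is a hypothetical read-only variant over `&[&[u8]]` that tracks an offset into the first remaining slice, chosen only to keep the example self-contained.

```rust
use std::io::{self, Read};

/// Hypothetical reader over borrowed slices; `offset` counts the bytes
/// already consumed from `parts[0]`.
struct SliceChain<'a> {
    parts: &'a [&'a [u8]],
    offset: usize,
}

impl<'a> SliceChain<'a> {
    fn new(parts: &'a [&'a [u8]]) -> Self {
        SliceChain { parts, offset: 0 }
    }
}

impl Read for SliceChain<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Skip exhausted or empty slices until one has data left.
        while let Some(first) = self.parts.first() {
            let rest = &first[self.offset..];
            if !rest.is_empty() {
                let n = rest.len().min(buf.len());
                buf[..n].copy_from_slice(&rest[..n]);
                self.offset += n;
                return Ok(n);
            }
            self.parts = &self.parts[1..];
            self.offset = 0;
        }
        Ok(0) // EOF
    }

    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> io::Result<usize> {
        // The remaining length is known up front, so reserve it once.
        let total = self.parts.iter().map(|p| p.len()).sum::<usize>() - self.offset;
        buf.reserve(total);
        let mut skip = self.offset;
        for part in self.parts {
            buf.extend_from_slice(&part[skip..]);
            skip = 0; // only the first slice is partially consumed
        }
        // Everything has been handed out; later reads report EOF.
        self.parts = &[];
        self.offset = 0;
        Ok(total)
    }
}
```

Whether the override pays off depends on the caller: it only helps code that actually calls `read_to_end`, while plain `read` calls go through the per-slice path either way.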

Shepmaster
  • This is a very elegant implementation, but it requires `&mut [&[u8]]`, which is similar, but not quite the same as what the OP was asking about ("a `Read` trait over `&[&[u8]]`"). It is of course possible that the OP is unsure about the difference and that they could easily make the outer slice mutable, but it's not obvious that that's the case. – user4815162342 Apr 09 '22 at 10:21
  • Also, this implementation doesn't handle [empty inner slices](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4807fb55efaf343476e98d9d919cb3cb), which I noticed because mine had the same issue. – user4815162342 Apr 09 '22 at 10:36
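
For the empty-inner-slice case raised in the last comment, one possible adjustment is to loop past empty heads instead of returning `Ok(0)`, which callers interpret as EOF. The sketch below is a variant of the `Wrapper` approach under a hypothetical name, `SkipEmpty`; it is not the answer author's code.

```rust
use std::{io::Read, mem};

/// Hypothetical variant of `Wrapper` that skips empty inner slices.
struct SkipEmpty<'a, 'b>(&'a mut [&'b [u8]]);

impl<'a, 'b> Read for SkipEmpty<'a, 'b> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        loop {
            let slices = mem::take(&mut self.0);

            match slices {
                [head, ..] => {
                    if head.is_empty() {
                        // Nothing left in this child slice (or it was
                        // empty from the start): drop it and retry.
                        self.0 = &mut slices[1..];
                        continue;
                    }

                    // Delegate to the `Read` impl for `&[u8]`, which
                    // advances `head` in place.
                    let n_bytes = head.read(buf)?;
                    self.0 = slices;
                    return Ok(n_bytes);
                }
                // No child slices remain: real EOF.
                _ => return Ok(0),
            }
        }
    }
}
```

With this change a fully consumed head stays in place until the next `read` call, which then skips it, so `Ok(0)` is only returned for an empty destination buffer or once every child slice is exhausted.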