
I'm implementing a data compression interface:

pub trait NumericEncoder<V> {
    fn encode(&mut self, value: V) -> io::Result<()>;
}

An encoder writes numbers to some kind of output, which might be a stream (file), a byte buffer, or even another encoder. One might invoke an implementation like so:

let f = File::create("out").unwrap();
// Delta encoder whose data is run-length-compressed
let mut enc = DeltaEncoder::new(RunLengthEncoder::new(f));
enc.encode(123).unwrap();

That's all fine and good, but in some cases I need multiple encoders against the same output stream. Something like (simplified):

let f = File::create("out")?;
let mut idEnc = RunLengthEncoder::new(DeltaEncoder::new(f));
let mut dataEnc = LZEncoder::new(f);
for (id, data) in input.iter() {
    idEnc.encode(id);
    dataEnc.encode(data);
}

Here, two encoders would be interleaving their data as they're writing it.

This needs mutable access to the same file, which isn't possible with straight &mut references. From what I can tell, the only way to accomplish this is with a RefCell; is there a better way?

As far as I can tell, this would make all the encoder implementations less clean. Right now an encoder can be declared like this:

pub struct MySpecialEncoder<'a, V, W>
where
    W: io::Write,
{
    w: &'a mut W,
    phantom: std::marker::PhantomData<V>,
}

With a RefCell, every encoder struct and constructor would need to deal with Rc<RefCell<W>>. That's not as nice, and it leaks the sharedness of the writer into the encoder, which shouldn't need to know that the writer is shared.

(I did consider whether I could change the NumericEncoder trait to take a writer argument, which would have to be std::io::Write. This won't work because some encoders don't write to a std::io::Write, but to another NumericEncoder.)

Shepmaster
Alexander Staubo
  • Why does your struct need to hold a reference to the file? Why not just pass it in when you call encode? `idEnc.encode(f, id);` `dataEnc.encode(f, data);` This allows more flexibility. – Stargateur Apr 14 '19 at 05:08
  • "This won't work because some encoders don't write to a std::io::Write, but to another NumericEncoder." That's not clear. This could use a [mcve]. – Stargateur Apr 14 '19 at 05:10
  • "This won't work because some encoders don't write to a std::io::Write, but to another NumericEncoder" - so why not implement `NumericEncoder` for `T: io::Write`? Then modify its signature to accept another `NumericEncoder`. – Laney Apr 14 '19 at 07:37
  • Idiomatic Rust uses `snake_case` for variables, methods, macros, fields and modules; `UpperCamelCase` for types and enum variants; and `SCREAMING_SNAKE_CASE` for statics and constants. Use `id_enc` / `data_enc` instead, please. – Shepmaster Apr 14 '19 at 12:55
  • These questions made me realize that I was not thinking the signature through. Even though some encoders write to another encoder, and not a `W`, I can of course make `W` part of the signature (`encode(W, V)`), because encoders can just pass the writer argument to their next encoder instead of using it. This means encoder structs don't need to carry the writer with them. Thanks, @Laney and @Stargateur. – Alexander Staubo Apr 14 '19 at 16:23
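
The approach described in the last comment, passing the writer to each `encode` call instead of storing it, might look roughly like the sketch below. The revised trait signature, `RawEncoder`, the simplified `DeltaEncoder`, and `interleave` are all assumptions made for illustration; they are not the asker's actual code.

```rust
use std::io::{self, Write};

// Hypothetical revision of the trait: the writer is an argument of each call,
// so no encoder has to store it.
pub trait NumericEncoder<V> {
    fn encode<W: Write>(&mut self, w: &mut W, value: V) -> io::Result<()>;
}

// Illustrative innermost stage: writes each value as 4 little-endian bytes.
pub struct RawEncoder;

impl NumericEncoder<u32> for RawEncoder {
    fn encode<W: Write>(&mut self, w: &mut W, value: u32) -> io::Result<()> {
        w.write_all(&value.to_le_bytes())
    }
}

// Illustrative outer stage: it owns the next encoder, not the writer, and
// simply forwards the writer it is handed.
pub struct DeltaEncoder<E> {
    prev: u32,
    next: E,
}

impl<E> DeltaEncoder<E> {
    pub fn new(next: E) -> Self {
        DeltaEncoder { prev: 0, next }
    }
}

impl<E: NumericEncoder<u32>> NumericEncoder<u32> for DeltaEncoder<E> {
    fn encode<W: Write>(&mut self, w: &mut W, value: u32) -> io::Result<()> {
        let delta = value.wrapping_sub(self.prev);
        self.prev = value;
        self.next.encode(w, delta)
    }
}

// Two independent encoders interleave their output into one writer.
pub fn interleave(input: &[(u32, u32)]) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut id_enc = DeltaEncoder::new(RawEncoder);
    let mut data_enc = RawEncoder;
    for &(id, data) in input {
        id_enc.encode(&mut out, id)?;
        data_enc.encode(&mut out, data)?;
    }
    Ok(out)
}
```

Because each call borrows the writer only for its own duration, the two encoders never hold `&mut` access at the same time, and no interior mutability is needed.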

1 Answer


the only way to accomplish this is with a RefCell

Any type that grants interior mutability will work. For example, a Mutex is also sufficient.
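
As a rough illustration of the Mutex option, here is a minimal sketch that shares one encoder through `Arc<Mutex<_>>`. The names `SharedEncoder`, `Collect`, and `mutex_demo` are hypothetical and exist only for this example; mapping a poisoned lock to an `io::Error` is one possible choice, not the only one.

```rust
use std::io;
use std::sync::{Arc, Mutex};

pub trait NumericEncoder<V> {
    fn encode(&mut self, value: V) -> io::Result<()>;
}

// Hypothetical handle: clones share one underlying encoder behind a mutex.
pub struct SharedEncoder<E>(Arc<Mutex<E>>);

impl<E> SharedEncoder<E> {
    pub fn new(e: E) -> Self {
        SharedEncoder(Arc::new(Mutex::new(e)))
    }
}

impl<E> Clone for SharedEncoder<E> {
    fn clone(&self) -> Self {
        SharedEncoder(Arc::clone(&self.0))
    }
}

impl<V, E: NumericEncoder<V>> NumericEncoder<V> for SharedEncoder<E> {
    fn encode(&mut self, value: V) -> io::Result<()> {
        // Assumption: a poisoned lock is reported as an I/O error.
        let mut guard = self
            .0
            .lock()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "encoder mutex poisoned"))?;
        guard.encode(value)
    }
}

// Illustrative sink that records every value it is given.
pub struct Collect(pub Vec<u32>);

impl NumericEncoder<u32> for Collect {
    fn encode(&mut self, value: u32) -> io::Result<()> {
        self.0.push(value);
        Ok(())
    }
}

// Two handles write to the same underlying encoder.
pub fn mutex_demo() -> io::Result<Vec<u32>> {
    let mut first = SharedEncoder::new(Collect(Vec::new()));
    let mut second = first.clone();
    first.encode(1u32)?;
    second.encode(2u32)?;
    first.encode(3u32)?;
    let seen = first.0.lock().unwrap().0.clone();
    Ok(seen)
}
```

For single-threaded use, `Rc<RefCell<_>>` avoids the locking cost; the Mutex version mainly matters if the handles must cross thread boundaries.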

this would make all the encoder implementations less clean

I don't know why you believe that. Create a type that uses interior mutability and only use that type when you need that extra functionality:

use std::{cell::RefCell, io, rc::Rc};

#[derive(Debug)]
struct Funnel<E>(Rc<RefCell<E>>);

impl<E> Funnel<E> {
    fn new(e: E) -> Self {
        Funnel(Rc::new(RefCell::new(e)))
    }
}

impl<E> Clone for Funnel<E> {
    fn clone(&self) -> Self {
        Funnel(self.0.clone())
    }
}

impl<V, E> NumericEncoder<V> for Funnel<E>
where
    E: NumericEncoder<V>,
{
    fn encode(&mut self, value: V) -> io::Result<()> {
        self.0.borrow_mut().encode(value)
    }
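}

fn main() -> io::Result<()> {
    // `Shared` and `Wrapper` are not defined in this excerpt: `Shared` stands for any
    // encoder, and `Wrapper` for an encoder that forwards to the encoder it wraps.
    let s = Shared;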

    let s1 = Funnel::new(s);
    let s2 = s1.clone();

    let mut e1 = Wrapper(s1);
    let mut e2 = Wrapper(s2);

    e1.encode(1)?;
    e2.encode(2)?;

    Ok(())
}
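
Since the snippet above does not show `Shared` or `Wrapper`, here is a self-contained sketch with stand-ins for both. The bodies of `Shared` and `Wrapper` and the helper `funnel_demo` are assumptions made up for illustration; only `Funnel` follows the code above.

```rust
use std::{cell::RefCell, io, rc::Rc};

pub trait NumericEncoder<V> {
    fn encode(&mut self, value: V) -> io::Result<()>;
}

// Hypothetical innermost encoder: records the values it receives.
#[derive(Debug, Default)]
struct Shared {
    seen: Vec<u32>,
}

impl NumericEncoder<u32> for Shared {
    fn encode(&mut self, value: u32) -> io::Result<()> {
        self.seen.push(value);
        Ok(())
    }
}

// Hypothetical outer encoder: generic over its target and unaware of sharing.
struct Wrapper<E>(E);

impl<V, E: NumericEncoder<V>> NumericEncoder<V> for Wrapper<E> {
    fn encode(&mut self, value: V) -> io::Result<()> {
        self.0.encode(value)
    }
}

// The only type that knows about interior mutability.
struct Funnel<E>(Rc<RefCell<E>>);

impl<E> Funnel<E> {
    fn new(e: E) -> Self {
        Funnel(Rc::new(RefCell::new(e)))
    }
}

impl<E> Clone for Funnel<E> {
    fn clone(&self) -> Self {
        Funnel(Rc::clone(&self.0))
    }
}

impl<V, E: NumericEncoder<V>> NumericEncoder<V> for Funnel<E> {
    fn encode(&mut self, value: V) -> io::Result<()> {
        self.0.borrow_mut().encode(value)
    }
}

// Two wrappers interleave their writes into the same shared encoder.
fn funnel_demo() -> io::Result<Vec<u32>> {
    let funnel = Funnel::new(Shared::default());
    let mut e1 = Wrapper(funnel.clone());
    let mut e2 = Wrapper(funnel.clone());
    e1.encode(1u32)?;
    e2.encode(2u32)?;
    e1.encode(3u32)?;
    let seen = funnel.0.borrow().seen.clone();
    Ok(seen)
}
```

Note that `Wrapper` compiles unchanged whether it is given a `Shared` directly or a `Funnel<Shared>`, which is the point: the sharing stays out of the encoder implementations.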

You should also consider taking W by value. I'm not sure why you need PhantomData; my code didn't.
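
A minimal sketch of what taking the writer by value could look like. The 2-byte big-endian encoding, `into_inner`, and `by_value_demo` are assumptions for illustration. Since the standard library implements `Write` for `&mut W` whenever `W: Write`, an encoder that owns a `W` can still be handed a mutable reference.

```rust
use std::io::{self, Write};

pub trait NumericEncoder<V> {
    fn encode(&mut self, value: V) -> io::Result<()>;
}

// Owns its writer. No lifetime parameter is needed, and no PhantomData either,
// because V appears only in the trait impl, not in the struct itself.
pub struct MySpecialEncoder<W> {
    w: W,
}

impl<W: Write> MySpecialEncoder<W> {
    pub fn new(w: W) -> Self {
        MySpecialEncoder { w }
    }

    // Hands the writer back once encoding is finished.
    pub fn into_inner(self) -> W {
        self.w
    }
}

// Illustrative encoding: each value is written as 2 big-endian bytes.
impl<W: Write> NumericEncoder<u16> for MySpecialEncoder<W> {
    fn encode(&mut self, value: u16) -> io::Result<()> {
        self.w.write_all(&value.to_be_bytes())
    }
}

pub fn by_value_demo() -> io::Result<Vec<u8>> {
    // Owned writer: the encoder takes the Vec by value.
    let mut owned = MySpecialEncoder::new(Vec::<u8>::new());
    owned.encode(0x0102u16)?;
    let mut buf = owned.into_inner();

    // Borrowed writer: `&mut Vec<u8>` is itself a `Write` implementor.
    {
        let mut borrowed = MySpecialEncoder::new(&mut buf);
        borrowed.encode(0x0304u16)?;
    }
    Ok(buf)
}
```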


Shepmaster
  • Thanks, this gives me some stuff to read about! `PhantomData` wasn't so useful in that case, I guess, but I've been using it elsewhere to avoid return-type type inference: I can create a concrete `SomeDecoder` var and know that `let v = decoder.decode()` always returns `io::Result`, instead of `let v: io::Result = decoder.decode()`. It seems that in `let v: u32 = decoder.decode()?`, Rust cannot infer the inner type variable of the `Result`, but I could be wrong. I get a compile error saying my function returns `()`. Same with type inference in `match decoder.decode()`. – Alexander Staubo Apr 14 '19 at 16:47