I've written a simple helper to loop over the nibbles (4-bit halves) of a u8 slice. It wraps an inner iterator over &u8 and yields two items per byte (low nibble first, then high). Both items refer to the same underlying u8, but each masks and shifts the bits when read.
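
To make the mask-and-shift step concrete, here is a minimal sketch that splits a single byte into its two nibbles. The function name `split_nibbles` is made up for illustration and is not part of the code in this question.

```rust
/// Hypothetical helper (not part of the question's code): returns the
/// (high, low) nibbles of a byte.
fn split_nibbles(byte: u8) -> (u8, u8) {
    // Keep the upper 4 bits, then shift them down into the low position.
    let high = (byte & 0b1111_0000) >> 4;
    // Keep the lower 4 bits as they are.
    let low = byte & 0b0000_1111;
    (high, low)
}
```

For the byte 0x79 this yields 7 as the high nibble and 9 as the low nibble, matching the values used in test_nibble_get.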

I also created a mutable version (not pasted here) using Rc and RefCell, which requires an underlying iterator over &mut u8. However, I would like the read-only version to also work with iterators that yield mutable references.

I've tried using I: 'a + Borrow<u8>, T: Iterator<Item = I> instead of the hard-coded &'a u8, and AsRef<u8> as well. Both attempts failed: once the inner item is no longer known to be a reference, the borrow happens inside my next() method, and the borrowed value would have to escape the closure that owns it.

What would be required to allow my Nibbler to work with iterators that either iterate over &u8 or &mut u8?

pub enum Nibble<'a> {
    MSB(&'a u8),
    LSB(&'a u8),
}

impl Nibble<'_> {
    pub fn from_u8(input: &u8) -> (Nibble, Nibble) {
        let msb = Nibble::MSB(input);
        let lsb = Nibble::LSB(input);
        (msb, lsb)
    }

    pub fn get(&self) -> u8 {
        match self {
            Nibble::MSB(r) => (**r & 0b11110000) >> 4,
            Nibble::LSB(r) => **r & 0b00001111,
        }
    }
}

pub struct Nibbler<'a, T> {
    rest: Option<Nibble<'a>>,
    inner: T,
}

impl<T> Nibbler<'_, T> {
    pub fn new(inner: T) -> Self {
        Nibbler { inner, rest: None }
    }
}

impl<'a, T: Iterator<Item = &'a u8>> Iterator for Nibbler<'a, T> {
    type Item = Nibble<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        self.rest.take().or_else(|| {
            self.inner.next().map(|byte| {
                let (msb, lsb) = Nibble::from_u8(byte);
                self.rest = Some(msb);
                lsb
            })
        })
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_nibble_get() {
        let val = 0x79;
        let (msb, lsb) = Nibble::from_u8(&val);
        assert_eq!(msb.get(), 7);
        assert_eq!(lsb.get(), 9);
    }

    #[test]
    fn test_nibbler() {
        let t = [0x12, 0x34, 0x56, 0x78];
        for (i, nibble) in Nibbler::new(t.iter()).enumerate() {
            match i {
                0 => assert_eq!(nibble.get(), 2),
                1 => assert_eq!(nibble.get(), 1),
                2 => assert_eq!(nibble.get(), 4),
                3 => assert_eq!(nibble.get(), 3),
                4 => assert_eq!(nibble.get(), 6),
                5 => assert_eq!(nibble.get(), 5),
                6 => assert_eq!(nibble.get(), 8),
                7 => assert_eq!(nibble.get(), 7),
                _ => {}
            }
        }
    }

    // #[test]
    // fn test_nibbler_mut() {
    //     let mut t = [0x12, 0x34, 0x56, 0x78];
    //     for (i, nibble) in Nibbler::new(t.iter_mut()).enumerate() {
    //         match i {
    //             0 => assert_eq!(nibble.get(), 2),
    //             1 => assert_eq!(nibble.get(), 1),
    //             2 => assert_eq!(nibble.get(), 4),
    //             3 => assert_eq!(nibble.get(), 3),
    //             4 => assert_eq!(nibble.get(), 6),
    //             5 => assert_eq!(nibble.get(), 5),
    //             6 => assert_eq!(nibble.get(), 8),
    //             7 => assert_eq!(nibble.get(), 7),
    //             _ => {}
    //         }
    //     }
    // }
}

As requested by @chayim-friedman, here's my attempt with Borrow:

use std::borrow::Borrow;

impl<'a, I: Borrow<u8> + 'a, T: Iterator<Item = I>> Iterator for Nibbler<'a, T> {
    type Item = Nibble<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        self.rest.take().or_else(|| {
            self.inner.next().map(|byte| {
                let (msb, lsb) = Nibble::from_u8(byte.borrow());
                self.rest = Some(msb);
                lsb
            })
        })
    }
}

which errors with

error[E0515]: cannot return value referencing function parameter `byte`
  --> src/utils/nibbler2.rs:42:17
   |
40 |                 let (msb, lsb) = Nibble::from_u8(byte.borrow());
   |                                                  ------------- `byte` is borrowed here
41 |                 self.rest = Some(msb);
42 |                 lsb
   |                 ^^^ returns a value referencing data owned by the current function
Mathijs Kwik

1 Answer

After struggling with this for a while, I finally found the solution in this answer:

pub enum Nibble<'a> {
    MSB(&'a u8),
    LSB(&'a u8),
}

impl Nibble<'_> {
    pub fn from_u8(input: &u8) -> (Nibble, Nibble) {
        let msb = Nibble::MSB(input);
        let lsb = Nibble::LSB(input);
        (msb, lsb)
    }

    pub fn get(&self) -> u8 {
        match self {
            Nibble::MSB(r) => (**r & 0b11110000) >> 4,
            Nibble::LSB(r) => **r & 0b00001111,
        }
    }
}

pub struct Nibbler<'a, T> {
    rest: Option<Nibble<'a>>,
    inner: T,
}

impl<T> Nibbler<'_, T> {
    pub fn new(inner: T) -> Self {
        Nibbler { inner, rest: None }
    }
}

impl<'a, T> Iterator for Nibbler<'a, T>
where
    T: Iterator,
    <T as Iterator>::Item: IntoNibbleRef<'a>,
{
    type Item = Nibble<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        self.rest.take().or_else(|| {
            self.inner.next().map(|byte| {
                let (msb, lsb) = Nibble::from_u8(byte.into_nibble_ref());
                self.rest = Some(msb);
                lsb
            })
        })
    }
}

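// Converts a reference-like item into `&'a u8` by consuming it, so the
// resulting reference keeps the original lifetime `'a`.
// Declared `pub` because it appears in a bound of a public impl.
pub trait IntoNibbleRef<'a> {
    fn into_nibble_ref(self) -> &'a u8;
}

impl<'a> IntoNibbleRef<'a> for &'a u8 {
    fn into_nibble_ref(self) -> &'a u8 {
        self
    }
}

impl<'a> IntoNibbleRef<'a> for &'a mut u8 {
    fn into_nibble_ref(self) -> &'a u8 {
        // A consumed `&'a mut u8` coerces to `&'a u8` for the full lifetime.
        self
    }
}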

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_nibble_get() {
        let val = 0x79;
        let (msb, lsb) = Nibble::from_u8(&val);
        assert_eq!(msb.get(), 7);
        assert_eq!(lsb.get(), 9);
    }

    #[test]
    fn test_nibbler() {
        let t = [0x12, 0x34, 0x56, 0x78];
        for (i, nibble) in Nibbler::new(t.iter()).enumerate() {
            match i {
                0 => assert_eq!(nibble.get(), 2),
                1 => assert_eq!(nibble.get(), 1),
                2 => assert_eq!(nibble.get(), 4),
                3 => assert_eq!(nibble.get(), 3),
                4 => assert_eq!(nibble.get(), 6),
                5 => assert_eq!(nibble.get(), 5),
                6 => assert_eq!(nibble.get(), 8),
                7 => assert_eq!(nibble.get(), 7),
                _ => {}
            }
        }
    }

    #[test]
    fn test_nibbler_mut() {
        let mut t = [0x12, 0x34, 0x56, 0x78];
        for (i, nibble) in Nibbler::new(t.iter_mut()).enumerate() {
            match i {
                0 => assert_eq!(nibble.get(), 2),
                1 => assert_eq!(nibble.get(), 1),
                2 => assert_eq!(nibble.get(), 4),
                3 => assert_eq!(nibble.get(), 3),
                4 => assert_eq!(nibble.get(), 6),
                5 => assert_eq!(nibble.get(), 5),
                6 => assert_eq!(nibble.get(), 8),
                7 => assert_eq!(nibble.get(), 7),
                _ => {}
            }
        }
    }
}

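You need to introduce a helper trait, here called IntoNibbleRef, that converts both &u8 and &mut u8 into &u8. Because its method takes self by value, it consumes the reference, and the resulting &'a u8 keeps the original lifetime, so it can escape the closure in next().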
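
As a smaller illustration of why consuming the item matters, the sketch below returns a reference taken from the first item of an iterator. The trait is repeated so the snippet compiles on its own. The function `first_byte` is a hypothetical helper, not part of the solution above.

```rust
// Same shape as the trait in the answer, repeated for self-containment.
trait IntoNibbleRef<'a> {
    fn into_nibble_ref(self) -> &'a u8;
}

impl<'a> IntoNibbleRef<'a> for &'a u8 {
    fn into_nibble_ref(self) -> &'a u8 {
        self
    }
}

impl<'a> IntoNibbleRef<'a> for &'a mut u8 {
    fn into_nibble_ref(self) -> &'a u8 {
        self
    }
}

// Hypothetical helper: the returned reference outlives both the closure
// and the iterator, because its lifetime `'a` comes from the consumed item.
fn first_byte<'a, I>(mut iter: I) -> Option<&'a u8>
where
    I: Iterator,
    I::Item: IntoNibbleRef<'a>,
{
    iter.next().map(|item| item.into_nibble_ref())
}
```

The same function accepts `slice.iter()` and `slice.iter_mut()`, which is the property the Nibbler impl relies on.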


After a little more experimenting, I realized you can also implement such a trait generically:

impl<'a, T> Iterator for Nibbler<'a, T>
where
    T: Iterator,
    <T as Iterator>::Item: IntoImmutableRef<'a, u8>,
{
    type Item = Nibble<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        self.rest.take().or_else(|| {
            self.inner.next().map(|byte| {
                let (msb, lsb) = Nibble::from_u8(byte.into_immutable_ref());
                self.rest = Some(msb);
                lsb
            })
        })
    }
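}

// Generic variant: works for any `T`, not only `u8`.
// Declared `pub` because it appears in a bound of a public impl.
pub trait IntoImmutableRef<'a, T> {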
    fn into_immutable_ref(self) -> &'a T;
}

impl<'a, T> IntoImmutableRef<'a, T> for &'a T {
    fn into_immutable_ref(self) -> &'a T {
        self
    }
}

impl<'a, T> IntoImmutableRef<'a, T> for &'a mut T {
    fn into_immutable_ref(self) -> &'a T {
        self
    }
}
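
Since the generic trait is not tied to u8, it should work for other element types too. The following is a minimal sketch under that assumption. `collect_refs` is a hypothetical helper introduced only for this example, and the trait is repeated so the snippet is self-contained.

```rust
trait IntoImmutableRef<'a, T> {
    fn into_immutable_ref(self) -> &'a T;
}

impl<'a, T> IntoImmutableRef<'a, T> for &'a T {
    fn into_immutable_ref(self) -> &'a T {
        self
    }
}

impl<'a, T> IntoImmutableRef<'a, T> for &'a mut T {
    fn into_immutable_ref(self) -> &'a T {
        self
    }
}

// Hypothetical helper: turns an iterator over `&T` or `&mut T` into a
// vector of shared references with the original lifetime.
fn collect_refs<'a, I, T>(iter: I) -> Vec<&'a T>
where
    I: Iterator,
    I::Item: IntoImmutableRef<'a, T>,
    T: 'a,
{
    iter.map(|item| item.into_immutable_ref()).collect()
}
```

A call such as `let refs: Vec<&u32> = collect_refs(v.iter_mut());` pins `T` through the annotated result type.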
Finomnis
  • Great. Thanks for figuring this out. I hope it was a nice struggle :) Looking over your generic version, I'm kind of left wondering why such a trait isn't part of the standard library. This seems pretty basic and only uses built-in constructs. I guess we could even argue that this case should just have been caught by deref semantics. In most places a mut ref will happily be passed to places where normal references are expected. Anyway, thanks for this solution, it taught me a bit about these edge cases of the type system. – Mathijs Kwik Jun 07 '22 at 19:18
  • It almost exists. There is [`Borrow`](https://doc.rust-lang.org/std/borrow/trait.Borrow.html), which covers this case, but it doesn't *consume* the original object (`borrow(&self)` vs. my `into_immutable_ref(self)`). Without taking ownership, the generic item would be dropped inside the `next().map(...)` closure while still borrowed. Then there is [`Deref`](https://doc.rust-lang.org/std/ops/trait.Deref.html), which covers this case as well, but also doesn't consume. – Finomnis Jun 08 '22 at 00:37
  • Then, there is [`CoerceUnsized`](https://doc.rust-lang.org/std/ops/trait.CoerceUnsized.html), which also goes in that direction, but it's still unstable and I couldn't get it to work. It told me `&u8: CoerceUnsized<&u8> is not implemented`, which I'm not 100% sure how to interpret. The `CoerceUnsized` docs specify that `&'a U: CoerceUnsized<&'b V>` is implemented if `U: Unsize<V>`. So my guess is that there is no `T: Unsize<T>`? – Finomnis Jun 08 '22 at 00:41
  • I think your problem is such a corner case that it is probably better to keep such a trait in the user's library – Finomnis Jun 08 '22 at 06:57
  • After thinking about it more, this case can never be handled by `Deref`, because `Deref` can also dereference an `Arc`, and an `Iter<Arc<u8>>` will never be convertible to a `Vec<&u8>` because the vec does not take ownership – Finomnis Jun 08 '22 at 07:30
  • The other thing that makes IntoImmutableRef interesting is that it passes along the lifetime. It can do that by consuming the thing that had the lifetime. For borrow and deref, that creates the issue, because without consuming the original reference, the newly-created reference cannot escape its scope. So in short, it's really interesting and develops a better understanding of the whole borrowing and "casting" model. – Mathijs Kwik Jun 08 '22 at 14:01
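
The difference discussed in these comments, consuming a reference versus borrowing through `&self`, can be sketched as follows. Both function names are hypothetical and exist only for this illustration.

```rust
use std::borrow::Borrow;

// Consuming version: the `&'a mut u8` is moved in, so the returned
// shared reference is valid for the full lifetime `'a`.
fn into_shared<'a>(r: &'a mut u8) -> &'a u8 {
    r
}

// Borrow-based version: `borrow(&self)` only borrows the local `item`,
// so the `&u8` it yields cannot leave this function. Returning `r`
// itself (with a suitable signature) is rejected with E0515, as in the
// question. Returning a copy of the byte is fine.
fn copy_via_borrow<B: Borrow<u8>>(item: B) -> u8 {
    let r: &u8 = item.borrow();
    *r
}
```

This is only a sketch of the lifetime argument. It does not replace the trait-based solution, which is what lets Nibbler store the reference inside Nibble.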