
Rust appears to be smart enough to use only one byte for A, but not smart enough to use one byte for B, even though B has only 8*8=64 possible values. Is there any way to coax Rust into figuring this out, or do I have to manually implement a more compact layout?


#![allow(dead_code)]

enum A {
    L,
    UL,
    U,
    UR,
    R,
    DR,
    D,
    DL,
}

enum B {
    C(A, A),
}

fn main() {
    println!("{:?}", std::mem::size_of::<A>()); // prints 1
    println!("{:?}", std::mem::size_of::<B>()); // prints 2
}
Joseph Garvin
  • That's because a Rust enum is at least as large as its largest variant. In this case, `A` is the size of a `u8`, so _two_ bytes are required to fit _two_ `A`s in `B`; there are no compile-time micro-optimizations like this. Anyway, what if the packed version were slower to use than the unpacked version? – Optimistic Peach Feb 03 '19 at 01:42
  • one word, implemented behavior. – Stargateur Feb 03 '19 at 01:57
  • @OptimisticPeach: it's certainly possible that it would be worse on some platforms/use-cases, but with today's memory latencies, smaller data structures usually make up for any unpacking time by causing fewer cache misses. I am going to have fairly large vectors of these objects that I'll be accessing semi-randomly, so cache misses are a concern for my use case. I'd be fine with something I have to opt into, as long as it saves me the work of writing the packing logic myself. – Joseph Garvin Feb 03 '19 at 02:02
  • Rust can do enum layout optimizations in some more limited cases, see https://github.com/rust-lang/rust/pull/45225 for example – the8472 Feb 03 '19 at 03:48

1 Answer


Both bytes are necessary to preserve the ability to borrow struct members.

A type in Rust is not an ideal set of values: it has a data layout, which describes how the values are stored. One of the "rules" governing the language is that putting a type inside a struct or enum doesn't change its data layout: it has the same layout inside another type as it does standalone, which allows you to take references to struct members and use them interchangeably with any other reference.*
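To make the rule concrete, here is a minimal sketch using the question's `A` and `B`. The helper `is_left` is a hypothetical function added purely for illustration; the point is that it accepts any `&A`, whether the value is standalone or stored inside a `B`.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum A { L, UL, U, UR, R, DR, D, DL }

#[allow(dead_code)]
enum B { C(A, A) }

// Hypothetical helper: it takes an ordinary `&A` and neither knows nor
// cares whether that `A` is a standalone value or a field of a `B`.
fn is_left(a: &A) -> bool {
    matches!(a, A::L | A::UL | A::DL)
}

fn main() {
    let standalone = A::UL;
    let b = B::C(A::L, A::R);

    // Borrow both fields in place; `first` and `second` are plain `&A`.
    let B::C(first, second) = &b;

    assert!(is_left(&standalone));
    assert!(is_left(first));
    assert!(!is_left(second));

    // Each field has its own address, so each needs its own byte.
    assert_ne!(first as *const A as usize, second as *const A as usize);
}
```

If both fields shared one byte, `first` and `second` could not both be valid `&A` values, because a reference can only point at a whole byte.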

There's no way to fit two As into one byte while satisfying this constraint, because the size of A is one whole byte -- you can't address a part of a byte, even with repr(packed). The unused bits just remain unused (unless they can be repurposed to store the enum tag by niche-filling).
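The niche-filling remark can be checked with `std::mem::size_of`. This is a sketch of what current rustc does in practice; the language does not guarantee these sizes for enums without an explicit `repr`, so treat the numbers as observed behavior rather than a promise.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum A { L, UL, U, UR, R, DR, D, DL }

#[allow(dead_code)]
enum B { C(A, A) }

fn main() {
    // Same types as in the question.
    println!("A:         {}", size_of::<A>());
    println!("B:         {}", size_of::<B>());

    // `A` only uses 8 of the 256 bit patterns in its byte, so the compiler
    // can typically store `None` in one of the unused patterns (a niche)
    // instead of adding a separate tag byte.
    println!("Option<A>: {}", size_of::<Option<A>>());
    println!("Option<B>: {}", size_of::<Option<B>>());
}
```

The unused bits are reused for an enum tag, but never to overlap two `A` values, since each `A` must stay individually addressable.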

*Well, repr(packed) can actually make this untrue. Taking a reference to a packed field can cause undefined behavior, even in safe code!
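If the one-byte layout matters, as with the large vectors mentioned in the question, one option is to pack by hand. The following is only a sketch: `PackedB`, `from_bits`, `first`, and `second` are hypothetical names chosen for illustration, and the 3-bits-per-field encoding is one of several possible choices.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum A { L = 0, UL, U, UR, R, DR, D, DL }

impl A {
    // Lookup table indexed by discriminant; avoids any unsafe transmute.
    const ALL: [A; 8] = [A::L, A::UL, A::U, A::UR, A::R, A::DR, A::D, A::DL];

    // Decode the low 3 bits back into an `A`.
    fn from_bits(bits: u8) -> A {
        Self::ALL[(bits & 0b111) as usize]
    }
}

// Hypothetical packed replacement for `B`: bits 3..=5 hold the first `A`,
// bits 0..=2 hold the second.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PackedB(u8);

impl PackedB {
    fn new(first: A, second: A) -> Self {
        PackedB(((first as u8) << 3) | (second as u8))
    }

    // Accessors return `A` by value: there is no byte to point a `&A` at.
    fn first(self) -> A {
        A::from_bits(self.0 >> 3)
    }

    fn second(self) -> A {
        A::from_bits(self.0)
    }
}

fn main() {
    let b = PackedB::new(A::UR, A::DL);
    assert_eq!(std::mem::size_of::<PackedB>(), 1);
    assert_eq!(b.first(), A::UR);
    assert_eq!(b.second(), A::DL);
}
```

The trade-off is the one described above: the packed type fits in one byte, but its fields can only be copied out, not borrowed.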

trent
  • I wonder if it's possible to have some sort of macro that would make a compact representation of B, that would involve generating multiple possible representations of A and implementing conversions for you to get the best of both worlds... – Joseph Garvin Feb 03 '19 at 17:25