Disclaimer: There is no guarantee that the exact behavior seen here will always hold true. YOU SHOULD NOT RELY ON THE SPECIFIC VALUES. Also, this answer is only based on observations and will only address enums specifically; other memory optimizations exist surrounding &T
s, NonNull<T>
s, NonZeroU8
s (and family), and nested structures containing those.
Basic enums
If you have a simple no-nested-structure enum, the default behavior is that the enum variants start at zero and increment upwards. So the first non-representable bit-pattern is used as the None
value:
enum Simple { A, B };
println!("{}", unsafe { transmute::<Option<Simple>, u8>(None) });
// prints 2
If your simple no-nested-structure enum leaves a gap at the front, the None
value will still be represented by the first non-representable bit-pattern after the enum variant representations:
enum GapInFront { A = 1, B };
println!("{}", unsafe { transmute::<Option<GapInFront>, u8>(None) });
// prints 3
If you leave a gap at the front, and have a variant at the end of the bit-space, only then will it use all-zeros as the None
value:
enum ExtendsToEnd { A = 1, B = 255 };
println!("{}", unsafe { transmute::<Option<ExtendsToEnd>, u8>(None) });
// prints 0
One thing to note, it will never pick a representation between variants for the None
value. Even if there are plenty of non-representable bit-patterns, variants occupying the boundaries will cause it to use 2-bytes:
enum Full { A = 0, B = 255 };
println!("{:?}", unsafe { transmute::<Option<Full>, [u8; 2]>(None) });
// prints [0, 60], which I believe is undefined behavior
// since I think the second byte is left uninitialized
My guess is that the compiler doesn't keep track of all representable bit-patterns, and instead only keeps ranges for doing these checks.
More interesting stuff.
If your enum has a variant with a nested enum value, it will take that into consideration as well:
enum Nested { A, B };
enum Complex { A(Nested), B };
println!("{}", unsafe { transmute::<Option<Complex>, u8>(None) });
// prints 3
However, it seems to break if two variants have values, even if their bit patterns don't overlap (sad):
enum Nested1 { A, B };
enum Nested2 { A=2, B };
enum MoreComplex { A(Nested1), B(Nested2) };
println!("{:?}", unsafe { transmute::<Option<MoreComplex>, [u8; 2]>(None) });
// prints [2, 211], again the second byte is left uninitialized
Another thing to point out, this isn't a special case with Option
; if you define your own option type, it behaves the same:
enum MyOption<T> { None, Some(T) };
println!("{}", unsafe { transmute::<MyOption<Simple>, u8>(MyOption::None) });
// prints 2
See all this on the playground.
See also What is the overhead of Rust's Option type?.