Investigating with the Visual Studio Machine Code Window
For a byte enum and an int enum, we can see that the memory is aligned to 4 bytes in each case on Windows x64, even when using the Flags attribute. A long enum, however, consumes 8 bytes per value.
It is therefore unnecessary to use a byte enum for local variables such as these: it consumes the same memory and runs at the same speed, and byte alignment should only be useful on 8-bit machines.
Explanation
This is because .NET uses 32-bit registers for integers on x86 as well as on x64 Intel/AMD processors, even with an AnyCPU or x64 target.
The .NET compiler optimizes for speed rather than memory, because the processor's internal bus size is optimized for (16/)32-bit access on an x86 CPU and (32/)64-bit access on an x64 CPU. So 32-bit access gives the best speed on x86 and 64-bit access the best on x64. Any other size forces the processor to widen or narrow the value. I wrote (16/) and (32/) because these accesses are optimized to be less slow than other size conversions.
The .NET compiler favors speed over memory, especially since it runs on a virtual machine and is therefore slower than native code, and it favors 32-bit for compatibility reasons. Indeed, 32-bit registers and memory accesses are better optimized on x86/x64 machines, and a smaller alignment costs performance (unless we have an 8-bit or 16-bit processor). Only memory pointers are 32 bits on x86, and 64 bits on x64 when compiled as AnyCPU or targeting x64.
So any data shorter than 32 bits should be aligned to 32 bits nowadays, and that could become 64 bits in the future.
Thus enum locals shorter than int, like other integral locals of 4 bytes or less, are 4-byte aligned here, so we lose 3 bytes per byte-sized value on .NET under x86 or x64. For fields in structs and classes, the layout also depends on the struct packing size (StructLayoutAttribute.Pack, whose default of 0 means the platform's default packing).
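The effect of the struct packing size can be checked with a small sketch. The two structs and the PackingDemo class below are hypothetical names introduced only for illustration; the sketch assumes a runtime where Marshal.SizeOf<T>() is available (.NET Framework 4.5.1 or later, or .NET Core).

```csharp
using System;
using System.Runtime.InteropServices;

enum FruitsByte : byte { Apple, Orange, Banana }

// Hypothetical struct using the default sequential layout.
[StructLayout(LayoutKind.Sequential)]
struct DefaultPacked
{
    public FruitsByte Fruit; // 1 byte of data
    public int Count;        // 4 bytes of data
}

// Same fields, but with the packing size forced to 1 byte.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct TightlyPacked
{
    public FruitsByte Fruit;
    public int Count;
}

static class PackingDemo
{
    static void Main()
    {
        // Marshal.SizeOf<T>() reports the unmanaged size, padding included.
        Console.WriteLine(Marshal.SizeOf<DefaultPacked>());
        Console.WriteLine(Marshal.SizeOf<TightlyPacked>());
    }
}
```

With Pack = 1 no padding is inserted between the fields, so the second size should be the plain sum of the field sizes, while the first is expected to be larger because the int field is aligned to its own size.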
Remark
Marshal.SizeOf on a struct or class instance gives the used (unmanaged) data size, not the memory actually reserved to store the data, which includes the padding lost to alignment.
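To illustrate the remark, here is a minimal sketch of what the size queries report. The ByteFruitHolder struct and the SizeOfDemo class are hypothetical and exist only so that Marshal.SizeOf has something to measure; the enums mirror the ones used in the example.

```csharp
using System;
using System.Runtime.InteropServices;

enum FruitsByte : byte { Apple, Orange, Banana }
enum FruitsInt { Apple, Orange, Banana }
enum FruitsLong : long { Apple, Orange, Banana }

// Hypothetical wrapper struct holding a single byte-sized enum.
[StructLayout(LayoutKind.Sequential)]
struct ByteFruitHolder
{
    public FruitsByte Fruit;
}

static class SizeOfDemo
{
    static void Main()
    {
        // sizeof on an enum type yields the size of its underlying type.
        Console.WriteLine(sizeof(FruitsByte)); // 1
        Console.WriteLine(sizeof(FruitsInt));  // 4
        Console.WriteLine(sizeof(FruitsLong)); // 8

        // Marshal.SizeOf reports the unmanaged data size of the instance,
        // not the stack slot that the JIT reserved for a local variable.
        var holder = new ByteFruitHolder { Fruit = FruitsByte.Orange };
        Console.WriteLine(Marshal.SizeOf(holder));
    }
}
```

None of these calls can see the stack slot reserved for a local, which is why the reserved size has to be read from the disassembly.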
Example
enum FruitsByte : byte { Apple, Orange, Banana }
enum FruitsInt { Apple, Orange, Banana }
enum FruitsLong : long { Apple, Orange, Banana }
For byte
var fb1 = FruitsByte.Apple;
mov dword ptr [rbp+0A4h],ecx
var fb2 = FruitsByte.Orange;
mov dword ptr [rbp+0A0h],1
0xA4 - 0xA0 = 4 bytes
For int
var fi1 = FruitsInt.Apple;
mov dword ptr [rbp+9Ch],ecx
var fi2 = FruitsInt.Orange;
mov dword ptr [rbp+98h],1
0x9C - 0x98 = 4 bytes
For long
var fl1 = FruitsLong.Apple;
mov qword ptr [rbp+90h],rcx
var fl2 = FruitsLong.Orange;
mov qword ptr [rbp+88h],rcx
0x90 - 0x88 = 8 bytes
Full machine code
var fb1 = FruitsByte.Apple;
00007FF7E93C0B19 xor ecx,ecx
00007FF7E93C0B1B mov dword ptr [rbp+0A4h],ecx
var fb2 = FruitsByte.Orange;
00007FF7E93C0B21 mov dword ptr [rbp+0A0h],1
var fi1 = FruitsInt.Apple;
00007FF7E93C0B2B mov dword ptr [rbp+9Ch],ecx
var fi2 = FruitsInt.Orange;
00007FF7E93C0B31 mov dword ptr [rbp+98h],1
var fl1 = FruitsLong.Apple;
00007FF7E93C0B3B movsxd rcx,ecx
00007FF7E93C0B3E mov qword ptr [rbp+90h],rcx
var fl2 = FruitsLong.Orange;
00007FF7E93C0B45 mov ecx,1
00007FF7E93C0B4A movsxd rcx,ecx
00007FF7E93C0B4D mov qword ptr [rbp+88h],rcx
More information
Data structure alignment (Wikipedia)
x64 Architecture - CPU registers (MSDoc)
8086 to i486 Instructions Set
Intel® 64 and IA-32 Architectures Software Developer Manuals