What is happening here is structure padding.
The compiler inserts it to ensure that each member resides at an aligned memory address.
You can read more about 'alignment' in x86/x86_64 here.
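To make the effect concrete, here is a small sketch you can compile and run yourself. The struct name and the sizes in the comments are illustrative (typical for x86_64); the ABI of your platform decides the real values:

```c
/* A sketch of padding in practice. Offsets/sizes in the comments are
   typical for x86_64 but depend on the ABI. */
#include <stdio.h>
#include <stddef.h>

struct example {
    char c;   /* 1 byte                                          */
              /* 3 bytes of padding inserted by the compiler     */
    int  i;   /* 4 bytes, wants 4-byte alignment                 */
    char d;   /* 1 byte                                          */
              /* 3 bytes of tail padding so that in an array of
                 this struct, 'i' stays aligned in every element */
};

int main(void)
{
    printf("sizeof(struct example) = %zu\n", sizeof(struct example)); /* typically 12, not 6 */
    printf("offsetof(i)            = %zu\n", offsetof(struct example, i)); /* typically 4, not 1 */
    return 0;
}
```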
Now, why should they be at aligned addresses? Machines access data from memory in 'words' (this example uses a 4-byte WORD). For a 4-byte WORD, this means that to read a single byte at the address b11001110, the machine actually reads 4 bytes (the last 2 bits of the address are effectively ignored during the read) and then picks out the byte you need once the WORD is in the CPU:
| b11001100 | b11001101 | b11001110 | b11001111 |   <- all four loaded at once
                         \---------/
                        only one used
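In other words, the word address is the byte address with its low 2 bits cleared, and those 2 bits select the byte inside the loaded word. A tiny sketch of that arithmetic (the variable names are mine):

```c
/* A sketch of what "ignoring the last 2 bits" means for a 4-byte word:
   clearing the low 2 bits gives the word address, and those 2 bits
   give the byte's position inside the loaded word. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uintptr_t byte_addr   = 0xCE;                         /* b11001110 */
    uintptr_t word_addr   = byte_addr & ~(uintptr_t)0x3;  /* b11001100 */
    unsigned  byte_offset = (unsigned)(byte_addr & 0x3);  /* 2: third byte in the word */

    printf("word address: 0x%lx, byte within word: %u\n",
           (unsigned long)word_addr, byte_offset);
    return 0;
}
```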
When you start reading bigger data types, reading an 'unaligned' datum can cost more than reading an aligned one: if you wanted to read 4 bytes (1 WORD) starting at the address b01110, instead of just one byte, you would have to read 2 WORDS:
       first load              second load
/----------------------\/-----------------------\
|01100|01101|01110|01111|10000|10001|10010|10011|
             \---------------------/
               unaligned data read
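A quick way to see why this costs two loads is to count how many word-sized slots the read touches. The helper below is only an illustration (its name and interface are made up for this answer):

```c
/* A sketch: count how many word-sized memory accesses a read of
   'size' bytes starting at 'addr' needs, for a given word size. */
#include <stdint.h>
#include <stdio.h>

static int loads_needed(uintptr_t addr, unsigned size, unsigned word_size)
{
    uintptr_t first_word = addr / word_size;
    uintptr_t last_word  = (addr + size - 1) / word_size;
    return (int)(last_word - first_word + 1);
}

int main(void)
{
    printf("4-byte read at b01100: %d load(s)\n", loads_needed(0x0C, 4, 4)); /* aligned   -> 1 */
    printf("4-byte read at b01110: %d load(s)\n", loads_needed(0x0E, 4, 4)); /* unaligned -> 2 */
    return 0;
}
```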
The compiler 'pads' structures in order to avoid such reads, because they are costly. As Woodrow Douglass suggests in their answer, you can force the compiler to 'pack' the structure instead of padding it.
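For example, with GCC or Clang you can mark a struct as packed via an attribute (MSVC uses `#pragma pack` instead). Treat this as a sketch rather than a recommendation, since accessing packed members can reintroduce exactly the unaligned accesses described above:

```c
/* A sketch of packing vs. padding. The attribute syntax is GCC/Clang;
   sizes in the comments are typical for x86_64. */
#include <stdio.h>

struct padded {
    char c;
    int  i;
};

struct __attribute__((packed)) packed {
    char c;
    int  i;
};

int main(void)
{
    printf("padded: %zu bytes\n", sizeof(struct padded)); /* typically 8 */
    printf("packed: %zu bytes\n", sizeof(struct packed)); /* 5: no padding, 'i' may be unaligned */
    return 0;
}
```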
One more thing: there are architectures where unaligned loads are not even possible. On such machines, the operating system usually catches the exception raised by an unaligned load and then emulates the load in software (e.g. by performing multiple aligned loads and combining the pieces).
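That software fallback is also what you do yourself when you must read unaligned data portably: copy it byte-wise into an aligned object instead of dereferencing an unaligned pointer. A minimal sketch (the function name is mine):

```c
/* A sketch of a portable unaligned read: memcpy into an aligned
   variable never performs a hardware unaligned word load, so it works
   even on strict-alignment CPUs. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* safe regardless of p's alignment */
    return v;
}

int main(void)
{
    unsigned char buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    /* the printed value depends on the machine's byte order */
    printf("value at offset 3: 0x%08x\n", (unsigned)load_u32_unaligned(buf + 3));
    return 0;
}
```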