I was reading Game Coding Complete, 4th edition, and came across a topic on memory alignment. In the code below, the author says the first struct is really slow because it is neither bit-aligned nor byte-aligned. The second one is byte-aligned but not bit-aligned. The last one is fast because it is both. He says that without the pragma, the compiler will align the members itself, which wastes memory. I couldn't follow the calculations.

This is a portion of the text:

If the compiler were left to optimize SlowStruct by adding unused bytes, each structure would be 24 bytes instead of just 14. Seven extra bytes are padded after the first char variable, and the remaining bytes are added at the end. This ensures that the entire structure always starts on an 8-byte boundary. That’s about 40 percent of wasted space, all due to a careless ordering of member variables.

This is the concluding line, in bold:

Don’t let the compiler waste precious memory space. Put some of your brain cells to work and align your own member variables.

Please show me the calculations and explain the padding concept more clearly.

Code:

#pragma pack(push, 1)
struct ReallySlowStruct
{
    char c : 6;
    __int64 d : 64;
    int b : 32;
    char a : 8;
};

struct SlowStruct
{
    char c;
    __int64 d;
    int b;
    char a;
};

struct FastStruct
{
   __int64 d;
   int b;
   char a;
   char c;
   char unused[2];
};
#pragma pack(pop)
Gamal Othman
Sourabh Mittal
  • 5
    _it is both not bit aligned nor byte aligned_. Bit aligned? I don't get this, the minimal addressable address in C is a byte, isn't it?. – David Ranieri Jan 18 '17 at 12:48
  • 2
    related/dupe: http://stackoverflow.com/questions/5397447/struct-padding-in-c and http://stackoverflow.com/questions/6025269/data-structure-padding – NathanOliver Jan 18 '17 at 12:50
  • 1
    Also see this wiki: https://en.wikipedia.org/wiki/Data_structure_alignment – NathanOliver Jan 18 '17 at 12:51
  • @KeineLust I stated it incorrectly by mistake. It is not aligned at bit boundaries. – Sourabh Mittal Jan 18 '17 at 12:54
  • @NathanOliver I'm checking the links you provided and will come back if I have any remaining doubts. Thank you all. – Sourabh Mittal Jan 18 '17 at 12:55
  • 2
    What on earth is a "bit boundary"? What happens in between bit boundaries? – Kerrek SB Jan 18 '17 at 12:58
  • @Kerrek SB Sir, I'm new to all this. I'm just quoting what is written in the book. The author has written the following in this regard: The first structure isn’t even aligned properly on bit boundaries, hence the name ReallySlowStruct. The definition of the 6-bit char variable throws the entire structure out of alignment. The second structure, SlowStruct, is also out of alignment, but at least the byte boundaries are aligned. – Sourabh Mittal Jan 18 '17 at 13:04
  • The fast structure doesn't hold the same information as the slow structures; the comparison isn't fair. And using bit-fields is a good way to slow anything down – Jonathan Leffler Jan 18 '17 at 13:28
  • I'm having a hard time to roll my head around this topic. I need more time to understand memory alignment. If anyone can explain this example in detail it would be very helpful. – Sourabh Mittal Jan 18 '17 at 13:42
  • From the paraphrases, the book seems a bit glib. The layout of bitfields (i.e., that first struct) is implementation-specific, so statements about the consequences of using bitfields that don't refer both to a particular compiler and to the settings used to compile the code aren't useful. – Pete Becker Jan 18 '17 at 13:46
  • @PeteBecker i agree with you. I'm just newbie kind of guy here but this book is too good till now. It covers the most important aspects of game programming which is also one of the most difficult fields. Now i think i'm cleared on the concept. Nathan's links are valuable and i'm marking those excellent links. Thank you all for your support. – Sourabh Mittal Jan 18 '17 at 13:50
  • The author uses a compiler extension to tell the compiler to produce poorly aligned structures by preventing padding bytes and then inserts padding bytes manually. I hope this is only done in the name of telling you about padding and alignment and this is not what the author would actually do. You should treat it as such. – nwp Jan 18 '17 at 15:06
  • @nwp I agree with you partially. I ran and tested code with sizeof operators on DevC++ and the code ran as expected and statements provided in the book are completely correct. I don't think he used some compiler configuration. I used TDM GCC 4.9.2 and everything is working perfectly fine. For further note you can refer to the book. – Sourabh Mittal Jan 18 '17 at 15:15
  • Sorry everyone. I made a mistake in the original question: I forgot to add the 64-bit int variable in FastStruct. I apologize. Please correct any statements regarding the topic. – Sourabh Mittal Jan 18 '17 at 15:38

1 Answer

The examples given in the book depend heavily on the compiler and the computer architecture. If you test them in your own program, you may get totally different results from the author's. I will assume a 64-bit architecture because, from what I've read in the description, the author does too. Let's look at the examples one by one:

ReallySlowStruct: If the compiler supports struct members that are not byte-aligned, "d" will start at the seventh bit of the first byte of the struct. That sounds very good for saving memory. The problem is that C does not allow bit addressing, so to store newValue in the "d" member the compiler must emit a lot of bit-shifting operations. Assuming bits are allocated from the least significant end, it stores the low two bits of newValue in the top two bits of byte 0 (moving them up by 6 bit positions), then shifts newValue right by two bits and stores the remaining 62 bits starting at byte 1. Byte 1 is also a misaligned address for a 64-bit value, so on hardware that requires aligned access the wide store instructions cannot be used and the compiler must write one byte at a time.
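
To make the bit fiddling concrete, here is a minimal sketch of what storing and loading "d" by hand could look like. It assumes one possible layout (6 bits of "c" in the low bits of byte 0, then the 64 bits of "d" allocated from the least significant end); real compilers are free to lay out bit-fields differently. The names store_d and load_d are hypothetical helpers, not anything from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout (one possibility only): bits 0..5 of buf[0] hold "c",
// and "d" occupies the next 64 bits, i.e. bits 6..69 of the 9-byte buffer.

// Hypothetical helper: store a 64-bit value at bit offset 6.
void store_d(std::uint8_t* buf, std::uint64_t value) {
    assert(buf != nullptr);
    // The low 2 bits of value go into the top 2 bits of byte 0; "c" is kept.
    buf[0] = static_cast<std::uint8_t>((buf[0] & 0x3F) | ((value & 0x3) << 6));
    std::uint64_t rest = value >> 2;          // the remaining 62 bits
    for (std::size_t i = 1; i <= 7; ++i) {    // 7 whole bytes = 56 bits
        buf[i] = static_cast<std::uint8_t>(rest & 0xFF);
        rest >>= 8;
    }
    // The last 6 bits go into the low bits of byte 8; its top 2 bits are kept.
    buf[8] = static_cast<std::uint8_t>((buf[8] & 0xC0) | (rest & 0x3F));
}

// Hypothetical helper: read the 64-bit value back from bit offset 6.
std::uint64_t load_d(const std::uint8_t* buf) {
    assert(buf != nullptr);
    std::uint64_t value = static_cast<std::uint64_t>(buf[0] >> 6);
    for (std::size_t i = 1; i <= 7; ++i) {
        value |= static_cast<std::uint64_t>(buf[i]) << (2 + 8 * (i - 1));
    }
    value |= static_cast<std::uint64_t>(buf[8] & 0x3F) << 58;
    return value;
}
```

Every access to "d" needs this kind of masking and shifting, whereas an aligned 64-bit member can usually be read or written with a single instruction.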

SlowStruct: It gets better. The compiler can get rid of all the bit fiddling. But with packing set to 1, "d" sits at offset 1, which is not a multiple of its 8-byte size, so writing it is still a misaligned access: depending on the architecture it is slower, has to be split into single-byte writes, or faults. And worse, if I switch off packing, I waste a lot of memory: the compiler pads each member so that it starts at a multiple of its own alignment. In this case "c" is followed by 7 padding bytes so that "d" starts at offset 8, "b" lands at offset 16 and "a" at offset 20, and 3 more padding bytes are added at the end so that the size is a multiple of 8. That gives 24 bytes instead of the packed 14.

FastStruct: Every member here is naturally aligned. "d" takes up 8 bytes at offset 0, as it should, and "b" follows at offset 8. Because the chars are bundled together at the end, the compiler does not need to pad them: chars are only 1 byte each and can sit at any address. With the two explicit "unused" bytes, the complete structure adds up to 16 bytes, which is divisible by 8, so no trailing padding is needed.
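
The numbers above can be checked with sizeof and offsetof. The following is a sketch, assuming a compiler that supports #pragma pack, a 4-byte int, and an 8-byte alignment for 64-bit integers (typical on x86-64). std::int64_t stands in for the book's __int64, and PackedSlowStruct, PaddedSlowStruct, and print_layouts are names made up for this illustration.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::int64_t
#include <cstdio>    // std::printf

#pragma pack(push, 1)        // no padding: members are placed back to back
struct PackedSlowStruct {
    char c;                  // offset 0
    std::int64_t d;          // offset 1, misaligned
    int b;                   // offset 9
    char a;                  // offset 13 when int is 4 bytes
};
#pragma pack(pop)

struct PaddedSlowStruct {    // same members, compiler's default alignment
    char c;
    std::int64_t d;
    int b;
    char a;
};

struct FastStruct {          // largest members first, explicit tail padding
    std::int64_t d;
    int b;
    char a;
    char c;
    char unused[2];
};

// Hypothetical helper: print the size of each struct and where "d" lands.
void print_layouts() {
    std::printf("packed slow: size %zu, d at offset %zu\n",
                sizeof(PackedSlowStruct), offsetof(PackedSlowStruct, d));
    std::printf("padded slow: size %zu, d at offset %zu\n",
                sizeof(PaddedSlowStruct), offsetof(PaddedSlowStruct, d));
    std::printf("fast:        size %zu, d at offset %zu\n",
                sizeof(FastStruct), offsetof(FastStruct, d));
}
```

Calling print_layouts() from main on your own compiler shows whether it agrees with the book; on other targets the padded sizes may differ, which is why the results are compiler- and architecture-dependent.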


In most scenarios, you never have to be concerned with alignment because the default alignment is already optimal. In some cases, however, you can achieve significant performance improvements, or memory savings, by specifying a custom alignment for your data structures.

In terms of memory space, the compiler pads the structure in a way that naturally aligns each element of the structure.

struct x_
{
   char a;     // 1 byte
   int b;      // 4 bytes
   short c;    // 2 bytes
   char d;     // 1 byte
} bar[3];

struct x_ is padded by the compiler and thus becomes:

// Shows the actual memory layout
struct x_
{
   char a;           // 1 byte
   char _pad0[3];    // padding to put 'b' on 4-byte boundary
   int b;            // 4 bytes
   short c;          // 2 bytes
   char d;           // 1 byte
   char _pad1[1];    // padding to make sizeof(x_) multiple of 4
} bar[3];

Source: https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=vs-2019
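
As a rough check of that layout, the padding rule can be written as "round the running offset up to the next multiple of the member's alignment". The sketch below assumes a 4-byte int and a 2-byte short with natural alignment; align_up and x_reordered are hypothetical names added for illustration and are not part of the Microsoft documentation.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct x_ {
    char a;     // offset 0
    int b;      // offset 4, after 3 padding bytes
    short c;    // offset 8
    char d;     // offset 10, then 1 trailing padding byte
};

// Hypothetical reordering: same members, largest alignment first.
struct x_reordered {
    int b;      // offset 0
    short c;    // offset 4
    char a;     // offset 6
    char d;     // offset 7, no padding needed
};

// Hypothetical helper: smallest multiple of align that is >= offset.
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Each member starts at the previous end, rounded up to its own alignment.
static_assert(offsetof(x_, b) == align_up(sizeof(char), alignof(int)),
              "b is padded up to int alignment");
static_assert(sizeof(x_) % alignof(x_) == 0,
              "total size is a multiple of the struct alignment");
```

Under these assumptions sizeof(x_) is 12 while sizeof(x_reordered) is 8, which is the same effect the book describes for SlowStruct and FastStruct.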

avgcoder
jwsc
  • 1
    Thank you for answering my question. I just ran some code from the different links and from the book and did some experiments, and it is now clear to me that there is no single standard for padding. But the author implies that, to increase performance, you should get to know your compiler's padding rules and order declarations accordingly. Your answer is also extremely helpful. Thank you. – Sourabh Mittal Jan 18 '17 at 15:12
  • I made a mistake in the FastStruct code: I forgot to add the 64-bit int variable. I'm sorry. Please update the answer accordingly. – Sourabh Mittal Jan 18 '17 at 15:37
  • Thank you for correcting answer. I appreciate your help and now I'm confident that i've understood the topic. Thanks again. – Sourabh Mittal Jan 18 '17 at 18:55
  • 3
    In your explanation of FastStruct, you talk about "b" taking up 8 bytes. Shouldn't that be "d"? – Victor Stone Apr 09 '19 at 17:27