I ran an experiment on cpp.sh (no special flags), where the word size appears to be 4 bytes. I declared two local variables of type Data, a struct containing just a single char, and printed their addresses. I then did the same with two variables of type char.

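#include <bitset>
#include <cstddef>
#include <cstdint>
#include <iostream>

struct Data {
    char c;
};

// Print the low 32 bits of an address in binary.
void PrintAddr(const void* ptr) {
    // std::uintptr_t can hold any object pointer; bitset<32> keeps the low 32 bits.
    std::cout << std::bitset<32>(reinterpret_cast<std::uintptr_t>(ptr)) << std::endl;
}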

int main()
{
    std::cout << "word size = " << sizeof(size_t) << std::endl;
    std::cout << "sizeof = " << sizeof(Data) << std::endl;
    std::cout << "alignof = " << alignof(Data) << std::endl;
    
    std::cout << "Data addresses: " << std::endl;

    Data a, b;
    PrintAddr(&a);
    PrintAddr(&b);
 
    std::cout << "char addresses: " << std::endl;
 
    char c, d;
    PrintAddr(&c);
    PrintAddr(&d);
}

Output:

word size = 4
sizeof = 1
alignof = 1
Data addresses: 
00000000010100000101001011101000
00000000010100000101001011100000
char addresses: 
00000000010100000101001011011111
00000000010100000101001011011110

It seems that padding is being added between the variables a and b, of type Data, while none is added between the char variables c and d. Why is this the case?

user17732522
  • Please don't tag `c` when asking a question about C++. These are two distinct languages. – user17732522 May 21 '23 at 22:29
  • Just my 2p, but printing the addresses in binary makes this more confusing, not less. – pmacfarlane May 21 '23 at 22:33
  • Can reproduce with Clang (but not GCC), regardless of optimization level: https://godbolt.org/z/ex9Y3qPrq – user17732522 May 21 '23 at 22:35
  • This doesn't address the question, but none of the output in this program needs the extra stuff that `std::endl` does. `'\n'` ends a line. – Pete Becker May 21 '23 at 22:36
  • Object types have alignment requirements that are implementation-defined. You have no reason to expect the same alignment for a `char` and a `struct`. – user207421 May 22 '23 at 02:13
  • @user207421 But as OP is showing: The alignment requirement of `Data` is just `1` on the implementation. Yet, it is still aligning stricter on the stack. – user17732522 May 22 '23 at 03:30
  • What platform are you compiling for, with what compiler and options? Struct alignment and layout rules are ABI-specific. (With MSVC for Windows x64 being quite surprising, not what you'd expect from `alignof`; see [Why is the "alignment" the same on 32-bit and 64-bit systems?](https://stackoverflow.com/q/55920103)). And implementations that choose to align more than the ABI's minimum requirement are also a thing, when the ABI doesn't force the layout. e.g. GCC and clang will align `double`s by 8 when they can, even for 32-bit x86 where `alignof(double)==4`, half their sizeof. – Peter Cordes May 22 '23 at 04:39
  • @user17732522: I was playing around with this: clang does actually reserve stack space for each unused object when you print the address. It actually aligns them each by 8 (easier to see with hex output). https://godbolt.org/z/dq6oaMxE1 But they're not taking up the full 8 bytes: the two `char` objects are at ...b3 and ...b2, only 2 bytes past a struct at ...b0. Anyway, this lack of packing of tiny structs is a minor missed optimization, wasting some stack space in this case where there's nothing in them worth aligning. – Peter Cordes May 22 '23 at 04:59
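
As suggested in the comments, the alignment each object actually ends up with is easier to read from hex output than from binary. The following is only a minimal sketch of one way to check it: `ObservedAlignment` and `Describe` are hypothetical helper names, not part of the question's code, and the sketch assumes a flat address space in which converting a pointer to `std::uintptr_t` yields its numeric address (implementation-defined, but true on mainstream platforms).

```cpp
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: the largest power of two that divides the address,
// i.e. the alignment the object actually landed on (0 for a null pointer).
std::uintptr_t ObservedAlignment(const void* ptr) {
    const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
    return addr & (~addr + 1);  // isolates the lowest set bit
}

// Hypothetical helper: formats an address in hex together with its
// observed alignment, e.g. for printing with std::cout.
std::string Describe(const void* ptr) {
    std::ostringstream out;
    out << "0x" << std::hex << reinterpret_cast<std::uintptr_t>(ptr)
        << std::dec << " (observed alignment " << ObservedAlignment(ptr) << ")";
    return out.str();
}
```

Calling `std::cout << Describe(&a) << '\n';` for each local makes the pattern visible at a glance. In the question's output, the two Data addresses end in binary `1000` and `00000`, so both are multiples of 8, while the two char addresses end in `1` and `10`, so they are only 1- and 2-aligned.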

1 Answer

A pedantic answer might be that the C++ language specification gives no guarantees of what the addresses of local variables might be. They might be next to each other, or there might be padding, or they might be completely unrelated! A language-lawyer might be happy to leave it at that.
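
To contrast this with a case where the language does pin down relative placement: elements of an array are contiguous, so two `Data` objects in an array sit exactly `sizeof(Data)` bytes apart, whatever the compiler does with separate locals. A minimal sketch, reusing the question's `Data` struct; `ByteDistance` and `ArrayIsPacked` are hypothetical names, and the integer arithmetic assumes a flat address space where pointer-to-integer conversion preserves byte offsets (implementation-defined, but the case on mainstream platforms).

```cpp
#include <cstddef>
#include <cstdint>

struct Data {
    char c;
};

// An array of N elements occupies exactly N * sizeof(element) bytes.
static_assert(sizeof(Data[2]) == 2 * sizeof(Data), "arrays have no gaps");

// Hypothetical helper: signed byte distance between two addresses,
// computed on integers so it is usable even for unrelated objects.
std::ptrdiff_t ByteDistance(const void* from, const void* to) {
    return static_cast<std::ptrdiff_t>(reinterpret_cast<std::uintptr_t>(to) -
                                       reinterpret_cast<std::uintptr_t>(from));
}

// Hypothetical check: adjacent array elements are sizeof(Data) apart.
bool ArrayIsPacked() {
    Data arr[2];
    return ByteDistance(&arr[0], &arr[1]) ==
           static_cast<std::ptrdiff_t>(sizeof(Data));
}
```

For two separate locals such as `Data a, b;`, `ByteDistance(&a, &b)` can legitimately come out as 1, 8, a negative number, or anything else, which is the behaviour the question observed.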

If you're asking why a specific compiler does that, you could amend your question (or add tags) to specify that.

Note that these automatic variables probably wouldn't even have addresses if you didn't take their address - the compiler would be free to keep them in registers or optimize them away entirely. That is probably even more likely for the char variables.

So that's my guess - the compiler you use is happy to pack automatic char variables on the stack (when you take their addresses), but is reluctant to pack automatic struct variables the same way.

pmacfarlane