Possible Duplicate:
struct sizeof result not expected
Struct varies in memory size?

Here is the code compiled on Ubuntu Server 11.10 for i386 machine:

// sizeof.c
#include <stdio.h>

int main(int argc, char** argv){
        // sizeof yields a size_t, so it is printed with %zu rather than %d
        printf("int's size: %zu bytes\n", sizeof(int));
        printf("double's size: %zu bytes\n", sizeof(double));
        printf("char's size: %zu bytes\n", sizeof(char));
        printf("\n");

        printf("char pointer's size: %zu\n", sizeof(char *));
        printf("\n");

        struct Stu{
                int id;
                char* name;
                char grade;
                char sex;
        //      double score;
        };
        printf("struct Stu's pointer's size : %zu\n", sizeof(struct Stu *));

        struct Stu stu;
        stu.id=5;
        stu.name="Archer";
        stu.grade='A';
        stu.sex='M';
        printf("Stu(int,char*, char,char)'s size: %zu bytes\n", sizeof(struct Stu));
        printf("Stu(5,\"Archer\",'A','M')'s size: %zu bytes\n", sizeof(stu));
        return 0;
}

compile:

`gcc -o sizeof sizeof.c`

output:

int's size: 4 bytes
double's size: 8 bytes
char's size: 1 bytes

char pointer's size: 4

struct Stu's pointer's size : 4
Stu(int,char*, char,char)'s size: 12 bytes
Stu(5,"Archer",'A','M')'s size: 12 bytes

My question is why the size of `struct Stu` is 12, not `sizeof(int) + sizeof(char *) + sizeof(char) + sizeof(char)` = 4 + 4 + 1 + 1 = 10. When you put a `double` member into `struct Stu`, `sizeof(struct Stu)` will be 20.

爱国者
  • 2
    http://en.wikipedia.org/wiki/Data_structure_alignment – Mysticial Jan 24 '12 at 07:07
  • Padding to align the data members on aligned boundaries. – Cody Gray - on strike Jan 24 '12 at 07:09
  • 3
    possible duplicate of [struct sizeof result not expected](http://stackoverflow.com/questions/1913842/struct-sizeof-result-not-expected), [Use of struct padding](http://stackoverflow.com/questions/4587470/use-of-struct-padding), [Data structure padding](http://stackoverflow.com/questions/6025269/data-structure-padding), [Struct varies in memory size?](http://stackoverflow.com/questions/6800884/struct-varies-in-memory-size), [C struct sizes inconsistence](http://stackoverflow.com/questions/8539348/c-struct-sizes-inconsistence) – Cody Gray - on strike Jan 24 '12 at 07:10
  • 1
    These comments are correct, the answers (so far) are poor. @math's answer is not relevant for this question. @Cody's comment is absolutely correct, as is @Mysticial's reference, albeit more terse. There are usually compiler pragmas that will force the struct to pack into the smallest space, but unless you use those the compiler optimizes for word-level access (even multiples of the CPU's word size) and so each `char` is in a word of its own, typically a 4 byte block. – Cyberfox Jan 24 '12 at 09:34
  • @Cyberfox: Yes, the answers are poor because the question is a duplicate. It's standard practice here not to duplicate answers in multiple places. – Cody Gray - on strike Jan 24 '12 at 18:35
  • @Cody: Ah! Interesting; I've been confused why some folks answer in the comments, instead of in full answers. It's an etiquette thing, then. Now I know; thank you! – Cyberfox Jan 24 '12 at 19:42
  • @Cyberfox: There are a couple of reasons. Either because they don't have time or a keyboard available to provide (what they think counts as) a complete answer, so they'll just post a hint or a tip. Or because they're also voting to close the question as a duplicate, and yes it's customary not to answer duplicate questions. If you want to write a good answer (and that's always encouraged!) post it on the linked duplicate. – Cody Gray - on strike Jan 24 '12 at 20:08

2 Answers

To calculate the sizes of user-defined types, the compiler takes into account any alignment space needed for complex user-defined data structures. This is why the size of a structure in C can be greater than the sum of the sizes of its members. For example, on many systems, the following code will print 8:

Reference: http://en.wikipedia.org/wiki/Sizeof

Suppose you have the following structure:

struct A1
{
  char a[2];
  int b;
};

You might think that sizeof(struct A1) equates to 6, but it doesn’t. It equates to 8. The compiler inserts 2 dummy bytes between members ‘a’ and ‘b’.

The reason is that the compiler will align member variables to a multiple of the pack size or a multiple of the type size, whichever is smaller.

The default pack size in Visual Studio is 8 bytes.

‘b’ is of the integer type, which is 4 bytes wide. ‘b’ will be aligned to the minimum of those 2, which is 4 bytes. It doesn’t matter if ‘a’ is 1, 2, 3 or 4 bytes wide. ‘b’ will always be aligned on the same address.

refer for more details

Hemant Metalia

Data has to be properly aligned for decent efficiency, so the compiler is at liberty to add padding to the interior of a structure (anywhere except at the start).

Generally, an N-byte type (for 1, 2, 4, 8, 16 bytes) is aligned on an N-byte address boundary.

Therefore, for a 32-bit compilation, the structure has:

    int id;          // offset =  0, size = 4
    char* name;      // offset =  4, size = 4
    char grade;      // offset =  8, size = 1
    char sex;        // offset =  9, size = 1
    double score;    // offset = 16, size = 8

For a total size of 24. Note that even if you moved the double around - say to the front of the structure, or to after the name, the size would still be 24 because all the elements of an array of the structure must be properly aligned, so there will be at least 6 bytes padding. (Sometimes, a double only needs to be aligned on a 4-byte boundary; the padding would then be 2 bytes instead of 6.)

Even without the double member, the structure must be 12 bytes long so that the id is properly aligned for the extra elements in an array - there would be 2 bytes of padding.

Some compilers provide programmers with a rope called #pragma pack or thereabouts, and some programmers leap at the opportunity to hang themselves with the rope thus provided.

Jonathan Leffler
  • Why the negativity w.r.t. `#pragma pack`? :) It's almost essential if you are dealing with on-disk or on-network representations of data. – Aidan Steele Jan 24 '12 at 07:26
  • 2
    Because it makes the program run slower, and requires care to ensure that the pragma is used consistently. Generally speaking, at least for the work I do, it is simpler/better to make sure the structure layouts are as compact as possible and live with any padding than to try squeezing the utmost in space out of the system at the cost of extra time accessing the misaligned data in a structure. I've never found it necessary to use `#pragma pack`; I don't expect to do so. But I've only been coding in C for a quarter century (and only in a limited range of programs), so maybe it matters for others. – Jonathan Leffler Jan 24 '12 at 07:48
  • I suppose issues won't arise as long as you use the same revision of the same compiler on the same target architecture. What do you do when you're sending data to/from machines that have different memory word sizes and unaligned-access rules? – Aidan Steele Jan 24 '12 at 08:01
  • 2
    I serialize properly. I deal all the time with data that is shipped between SPARC (big-endian) and PPC (ditto) and Intel (little-endian). There is work involved, but only when shipping the data, not when accessing the structure after it has arrived (or before it is ready to be sent). `#pragma pack` would be no help across those machines; it is non-portable and need not do the same thing on the different compilers on the different machines (or even different compilers on a single machine). – Jonathan Leffler Jan 24 '12 at 08:24
  • `#pragma pack` was mandatory when building a C simulation of an embedded processor which had packed output formats for pixel values plus ancillary information. And that's about the only time it's ever been important for me to use a packing pragma, to emulate specific hardware output. I agree that it should be avoided unless you're in equivalently dire/specific circumstances. – Cyberfox Jan 24 '12 at 09:41
  • @Cyberfox: if you're emulating an embedded processor or something similar, using `#pragma pack` probably makes sense. But I suspect source code portability was of less importance than fidelity to the detailed layout of the data of the simulated processor. – Jonathan Leffler Jan 24 '12 at 14:49
  • I'm not sure what relevance endianness has w.r.t. data structure alignment? I was thinking more along the lines of communicating between an 8051 and ARM or DSP of your choice. I've found `#pragma pack` very useful for these purposes and have yet to run across any issues. Perhaps I've just been lucky. – Aidan Steele Jan 24 '12 at 21:20
  • If you transfer data between different types of machine, endian-ness matters. While you stick to one machine, you can get away with blue murder. Once you move to other machines, you have to worry about sizes and endian-ness, and maybe about packing too. Sizes and endian-ness are more of an issue than packing in my experience. Maybe you've been lucky; maybe you only deal in single-byte character strings (where these are not issues). I doubt if it is 'strings only', though. Off-hand, I don't know if those are big-endian or little-endian systems or a mixture. – Jonathan Leffler Jan 24 '12 at 21:46