7

I asked a question earlier on defining a structure using malloc. This was the answer I was given by the majority:

struct retValue* st = malloc(sizeof(*st));

I was showing a friend my code, and we came to a stumbling block. Could someone please explain why this code works? From my viewpoint, *st hasn't been defined when you malloc it, so there could be any kind of garbage in there. It should be malloc(sizeof(struct retValue))

Thanks for any help

Blackbinary
  • You answered your own question. sizeof(struct retValue) is correct – Sam Post Feb 01 '10 at 17:03
  • 1
    Sorry, the question was not 'Is this correct, or what is correct?' it was 'Why does this work?' – Blackbinary Feb 01 '10 at 17:15
  • 2
    It might help understanding if you change the terminology. You aren't *defining* a structure using malloc, you're *allocating* a structure using malloc. You're defining `st`, which is a pointer (not a struct). That pointer is already defined and available for use in the initialiser expression (on the RHS of the equals sign), it just doesn't have a value, so most uses would be invalid. This one's OK, though, because sizeof doesn't use the value. – Steve Jessop Feb 01 '10 at 18:19

4 Answers

19

The sizeof operator doesn't actually evaluate its operand; it just looks at its type. Except for variable-length arrays, this is done at compile time rather than at runtime, so it can safely be applied before the variable has been assigned a value.

interjay
  • I think I understand. So you're saying `sizeof` looks at *st and says 'Oh, that's a pointer!' and then allocates enough memory for a pointer. It doesn't care what *st is actually holding. Right? – Blackbinary Feb 01 '10 at 17:03
  • 5
    No, it's the other way around. `*st` means "the thing that st points to", so the compiler returns the size of the structure, not the size of the pointer. – Graeme Perrow Feb 01 '10 at 17:05
  • 3
    @Blackbinary: Close: `st` is a pointer, but `*st` is a struct. So it looks at `*st` and says "Oh that's a `struct retValue`!" and then allocates enough memory for a retValue structure. The actual contents of `*st` don't matter. – interjay Feb 01 '10 at 17:06
  • Ah, so close. thanks. Must've just got confused as I typed because I was thinking struct in my head! – Blackbinary Feb 01 '10 at 17:10
  • sizeof looks at the type of the expression. `st` is of type "pointer to struct retValue". `*st` is of type "struct retValue" (`*` is the pointer dereferencing operator). Therefore, the type of the expression is "struct retValue", and it is the size of that type that the compiler substitutes for the sizeof expression. – VoidPointer Feb 01 '10 at 17:38
  • 1
    Argument evaluation has nothing to do with this. – dirkgently Feb 01 '10 at 23:10
19

sizeof looks at the type of the expression given to it; it does not evaluate the expression. Thus, you only need to make sure that the variables used in the expression are declared, so that the compiler can deduce their types.

In your example, st is already declared as pointer-to-struct-retValue. Consequently the compiler is able to deduce the type of the expression "*st".

Although it may not look as if st is declared yet at that point, it is: in C, the scope of an identifier begins just after its declarator, that is, before the initializer. So st is already in scope, with a known type, inside the sizeof expression on the right-hand side of the equals sign.

One way to illustrate the knowledge that is available to the compiler is to look at the intermediate output it generates. Consider this example code...

struct retValue {long int a; long int b;};
...
printf("Hello World!\n");
struct retValue* st = malloc(sizeof(*st));

Using gcc as an example and the above code in the main() function of test.c, let's look at the intermediate output by running...

gcc -fdump-tree-cfg test.c

The compiler will generate a file such as test.c.022t.cfg (the pass number in the name varies between GCC versions). Look at it and you'll see:

[ ... removed internal stuff ...]
;; Function main (main)

Merging blocks 2 and 3
main (argc, argv)
{
  struct retValue * st;
  int D.3097;
  void * D.3096;

  # BLOCK 2
  # PRED: ENTRY (fallthru)
  __builtin_puts (&"Hello World!"[0]);
  D.3096 = malloc (16);
  st = (struct retValue *) D.3096;
  D.3097 = 0;
  return D.3097;
  # SUCC: EXIT

}

Note how the declaration was moved to the beginning of the block, and how the argument to malloc has already been replaced with the actual value: the size of the type of the expression. As pointed out in the comments, the fact that the declaration was moved to the top of the block is an implementation detail of the compiler. However, the fact that the compiler can do this, and can also insert the correct size into the malloc call, shows that it was able to deduce the necessary information from the input.

I personally prefer to give the actual type name as a parameter to sizeof, but that is probably a question of coding-style where I'd say that consistency trumps personal-preference.

VoidPointer
  • 1
    His question was about `*st` doing "bad things". `*st` is invalid if `st` doesn't point to valid data. But since `sizeof` doesn't evaluate its argument, it's okay to use `*st` as an operand to `sizeof`, even if `st` is `NULL` for example. – Alok Singhal Feb 01 '10 at 17:17
  • Actually, the question was "how/why does it work despite the fact that st cannot be dereferenced at the time malloc needs to know how much space to allocate"... – VoidPointer Feb 01 '10 at 17:33
  • 1
    Putting a declaration at the beginning of the block in the parsed version of the code is an implementation detail. The compiler isn't required to do it, even though yours happens to, and the questioner's code works whether the compiler does it or not. The following code doesn't compile, showing that the declaration isn't "really" moved: `sizeof(*st); char *st = 0;`. – Steve Jessop Feb 01 '10 at 18:14
  • In addition to what Steve Jessop said, let me add that it isn't code rearrangement that determines if the `sizeof` works but rather a declaration/definition _in scope_. Your answer ascribes standard behavior incorrectly to an implementation detail. – dirkgently Feb 01 '10 at 23:09
  • I have rephrased the answer so that it no longer emphasizes the reordering as the reason but only as an example for what the compiler is able to deduce from the input. Thanks for your feedback guys. – VoidPointer Feb 02 '10 at 10:32
1

What matters is the declaration/definition of the structure type, not the definition of an object of that type. By the time you reach the malloc, the compiler will have encountered a declaration/definition; otherwise you'd hit a compiler error.

The fact that sizeof does not evaluate its operands is a side-issue.

A minor nit: remember that parentheses are required when the operand of sizeof is a type name, as in:

sizeof(struct retValue);

but not when the operand is an expression such as an object, where we can simply write:

sizeof *st;

See the standard:

6.5.3 Unary operators Syntax

unary-expression:
[...]
sizeof unary-expression
sizeof ( type-name )
dirkgently
0

In C, sizeof is an operator, and doesn't evaluate its argument. That can lead to "interesting" effects, that someone new to C does not necessarily anticipate. I mentioned that in more detail in my answer to the "Strangest language feature" question.

Alok Singhal