2

Apologies if this is a duplicate, in which case I couldn't find the right keywords to search.

This is in reference to some old (MUD) code I'm working on, which is pasted below. I'm confused by the purpose of the `foo_zero` and `*foo = foo_zero` parts of the code. This pattern is used throughout the codebase. My guess is that it's a way of initializing all of the members of `foo` to zero/NULL without having to set them explicitly.

typedef struct FOO {
    int buzz;
    char *bazz;
} FOO;

FOO *init_foo(void)
{
    static FOO foo_zero;
    FOO *foo;
    
    foo = malloc(sizeof(*foo));
    *foo = foo_zero; // <-- why?
    return foo;
}
M.M
  • 1
    Yep, this is exactly that. `static` variables are always zero-initialized, and the code provided zero-initializes `foo`. One could use `calloc` to achieve the same result with slightly less code. – SergeyA Jun 25 '21 at 19:59
  • 2
    It would almost certainly be more efficient to just use `calloc()`, but I'm not certain offhand if that's semantically identical. – Andrew Henle Jun 25 '21 at 20:01
  • @LeeDanielCrocker, nope, this code does not set the pointer to point to `zero_foo`! – SergeyA Jun 25 '21 at 20:01
  • @AndrewHenle `calloc()` doesn't necessarily create null pointers. It fills the allocated data with zero bytes. – Barmar Jun 25 '21 at 20:02
  • 2
    @SergeyA Technically, `calloc()` won't initialize pointers to `NULL`, although in practice it does on common processors. – Barmar Jun 25 '21 at 20:03
  • @LeeDanielCrocker Well, the allocated memory has to be zeroed at runtime no matter what. And I suspect `calloc()` would do that more efficiently than `malloc()` and then a `struct` assignment. – Andrew Henle Jun 25 '21 at 20:06
  • Ahh, I see what's going on now--I misread it the first time through. Yes, assigning the structure through the pointer copies bytes over, while calloc() zeros bytes without reading any, so calloc probably is more efficient (not to mention not wasting the memory of the static) – Lee Daniel Crocker Jun 25 '21 at 20:07
  • @Barmar: Common processors do not have a null pointer. They do not care if an address is zero or something else; there is no special treatment of such. The binary representation of a null pointer is entirely an invention of the operating system and/or the C implementation. The operating system may be involved because it reserves the page at address zero to be inaccessible deliberately for the purpose of making the address 0 serve as an invalid pointer. Whether it does or not, the C implementation determines for itself what it wants the representation(s) of a null pointer to be. – Eric Postpischil Jun 25 '21 at 20:07
  • @EricPostpischil OK, replace "processor" with "implementations", the point is still valid. It's not required by the standard, but it works in practice. – Barmar Jun 25 '21 at 20:08
  • At any rate, the point is that using this static allocation technique, versus calling `calloc` (or using `memset`) are *not* semantically identical. – Steve Summit Jun 25 '21 at 20:09
  • @Barmar I do not think you are correct. https://en.cppreference.com/w/c/types/NULL - NULL is mandated to have a value of 0, so `calloc`, which has to set all the bytes to 0, will make all pointers NULL. – SergeyA Jun 25 '21 at 20:14
  • @SteveSummit I am not certain about it. I believe, they are semantically equivalent, as they perform zero initialization. – SergeyA Jun 25 '21 at 20:15
  • 1
    @SergeyA: It is not that simple, see https://stackoverflow.com/questions/2759845/why-is-address-zero-used-for-the-null-pointer and https://stackoverflow.com/questions/32136092/how-to-write-c-c-code-correctly-when-null-pointer-is-not-all-bits-zero, https://stackoverflow.com/questions/9894013/is-null-always-zero-in-c. You may write `0` in source code as a null pointer constant, but that does not imply or require that an actual null pointer consists of zero bytes. – Nate Eldredge Jun 25 '21 at 20:16
  • 1
    @SergeyA It's an obscure distinction that almost never makes a difference in practice, because all common implementations do use all-zero bytes as the representation of null pointers. But one of the first computers I used was Multics, its null pointer was all-one bits. – Barmar Jun 25 '21 at 20:20
  • @Barmar fair enough. On implementations where null pointers are represented by non-zero bytes, `calloc` will behave differently from copying from a static struct. I believe such implementations are rare nowadays, but I agree that there is a semantic difference. – SergeyA Jun 25 '21 at 20:22
  • @SergeyA Exactly. Which is why I said that in practice `calloc` works. – Barmar Jun 25 '21 at 20:23
  • The "efficiency" discussed in comments is a quality of implementation issue. Interesting to see how different code is generated by one popular compiler for four functions that are exactly equivalent on the target platform in use: [godbolt link](https://gcc.godbolt.org/z/EdY4q6KEj) – M.M Jun 25 '21 at 21:48

3 Answers

2

Yes, the lines

static FOO foo_zero;

and

*foo = foo_zero;

arrange that every new instance of struct FOO allocated by init_foo() is initialized just as if someone had said

struct FOO new_foo = { 0 };

Specifically, all integer fields will be initialized to 0, all floating-point fields will be initialized to 0.0, and all pointer fields will be initialized to null pointers (aka NULL, or nullptr in C++).
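As a minimal sketch of the pattern, here is the same idea with a check for allocation failure added, which the original `init_foo()` does not have. The names `foo_new_zeroed`, `foo_is_zeroed`, and `foo_demo` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct FOO {
    int buzz;
    char *bazz;
} FOO;

/* Hypothetical variant of init_foo(): same pattern, plus a malloc() check. */
FOO *foo_new_zeroed(void)
{
    /* Static storage duration, no initializer: buzz is 0, bazz is a null pointer. */
    static FOO foo_zero;

    FOO *foo = malloc(sizeof *foo);
    if (foo != NULL)
        *foo = foo_zero;    /* struct assignment copies every member */
    return foo;             /* NULL if the allocation failed */
}

/* Hypothetical helper: compares values, not bit patterns. */
bool foo_is_zeroed(const FOO *foo)
{
    return foo->buzz == 0 && foo->bazz == NULL;
}

/* Hypothetical demo: allocate, check, release. */
void foo_demo(void)
{
    FOO *foo = foo_new_zeroed();
    if (foo != NULL) {
        assert(foo_is_zeroed(foo));
        free(foo);
    }
}
```

The check in `foo_is_zeroed` compares against `0` and `NULL` rather than inspecting bytes, so it holds regardless of how the implementation represents a null pointer.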

This is a nice technique, because it's both simpler and, strictly speaking, more portable than other techniques.

There's a discussion percolating in the comments about the alternative possibilities of doing

foo = malloc(sizeof(*foo));
memset(foo, 0, sizeof(*foo));

or

foo = calloc(1, sizeof(*foo));

Both of these would initialize the brand-new struct FOO to all-bits-0. The subtle problem here -- which is so subtle that many programmers would not call it a problem at all -- is that it is theoretically possible for a processor and/or operating system to represent a floating-point value of 0.0, or a null pointer, with a bit pattern of something other than all-bits-0.
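Whether a given implementation actually uses all-bits-0 for these values can be probed at run time. The sketch below is only an illustration: all function names are hypothetical, and it inspects just `char *` and `double`, not every pointer or floating-point type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if every byte of the object is 0. */
static bool all_bytes_zero(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            return false;
    return true;
}

/* True if this implementation represents a null char * as all-bits-0. */
bool null_pointer_is_all_bits_zero(void)
{
    char *p = NULL;
    return all_bytes_zero(&p, sizeof p);
}

/* True if this implementation represents 0.0 as all-bits-0. */
bool double_zero_is_all_bits_zero(void)
{
    double d = 0.0;
    return all_bytes_zero(&d, sizeof d);
}

/* Hypothetical startup check for code that zeroes structs with calloc() or memset(). */
void require_all_bits_zero_platform(void)
{
    assert(null_pointer_is_all_bits_zero());
    assert(double_zero_is_all_bits_zero());
}
```

On mainstream platforms both checks are expected to pass, which is why `calloc` and `memset` work in practice. The C standard does not promise that they will.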

But if you're on such a processor, then doing

float f = 0;

or

char *p = 0;

will do the right thing, initializing the variable with the proper zero value, even if it's not all-bits-0. And for an aggregate such as struct FOO, doing

struct FOO new_foo = { 0 };

is equivalent to explicitly initializing each of its members with 0, meaning you get the proper zero value, even if that's not all-bits-0. And, finally, any time you declare a variable with static duration, as in

static FOO foo_zero;

you get an implicit initialization as if you'd said = { 0 };, and therefore the default (static) initialization, too, gives you those correct zero values no matter what.
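If a C99 compiler is available, the same `{ 0 }` initialization can be applied through a compound literal, without a separate static object. This is an alternative sketch and not what the code in the question does. The names `foo_new_literal` and `foo_literal_demo` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct FOO {
    int buzz;
    char *bazz;
} FOO;

/* Hypothetical variant: C99 compound literal instead of a static template. */
FOO *foo_new_literal(void)
{
    FOO *foo = malloc(sizeof *foo);
    if (foo != NULL)
        *foo = (FOO){ 0 };  /* buzz = 0; bazz is initialized as if static, i.e. null */
    return foo;
}

/* Hypothetical demo: the result matches a static object member by member. */
void foo_literal_demo(void)
{
    static FOO foo_zero;
    FOO *foo = foo_new_literal();
    if (foo != NULL) {
        assert(foo->buzz == foo_zero.buzz);
        assert(foo->bazz == foo_zero.bazz);
        free(foo);
    }
}
```

Both forms assign proper zero values to each member. The static version also works with compilers that predate C99, which may matter for an old codebase.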

If you're still curious about calloc's all-bits-0 guarantee, you can read a bit more about it in [question 7.31](http://c-faq.com/malloc/calloc.html) of the [C FAQ list](http://c-faq.com/).

Steve Summit
  • As an alternative to the "C FAQ", one can read [over here](https://vorpus.org/blog/why-does-calloc-exist/) about why `calloc` exists in the first place. – Chef Gladiator Jun 27 '21 at 19:15
1

In fact this declaration

static FOO foo_zero;

is equivalent to the following

static FOO foo_zero = { .buzz = 0, .bazz = 0 };

So in this assignment statement

*foo = foo_zero;

the object pointed to by foo is zero-initialized in the same way as the static variable foo_zero.

The function returns a pointer to a zero-initialized object.

For this simple case you could achieve almost the same effect if instead of malloc you used calloc.

FOO *init_foo(void)
{
    return calloc( 1, sizeof( struct FOO ) );
}

But sometimes a non-trivial initialization is required, and there the approach you showed is useful. For example

struct FOO
{
    size_t n;
    char s[10];
};

struct FOO * init_foo( void )
{
    static struct FOO default_foo = { .n = 6, .s = "Hello" };
    
    struct FOO *foo = malloc( sizeof( *foo ) );
    
    if ( foo ) *foo = default_foo;
    
    return foo;
}
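As a usage sketch, each call hands back an independent copy of the template, so changing one instance does not affect the next. The function `template_demo` is hypothetical, and marking the template `const` is an added assumption rather than part of the code above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct FOO
{
    size_t n;
    char s[10];
};

/* Same shape as init_foo() above; const guards the template against accidental writes. */
struct FOO *init_foo(void)
{
    static const struct FOO default_foo = { .n = 6, .s = "Hello" };

    struct FOO *foo = malloc(sizeof *foo);
    if (foo)
        *foo = default_foo;
    return foo;
}

/* Hypothetical demo: modifying one instance leaves later instances untouched. */
void template_demo(void)
{
    struct FOO *a = init_foo();
    struct FOO *b = NULL;

    if (a != NULL) {
        strcpy(a->s, "Bye");    /* 4 bytes including the terminator, fits in char[10] */
        a->n = 4;

        b = init_foo();
        if (b != NULL) {
            assert(b->n == 6);
            assert(strcmp(b->s, "Hello") == 0);
        }
    }
    free(b);    /* free(NULL) is a no-op */
    free(a);
}
```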
Steve Summit
Vlad from Moscow
  • 3
    While it's true, for virtually all popular processors which the OP is likely to use, that `calloc` is likely to give the same result, this is *not* strictly guaranteed by the C standard. `calloc` will give all-bits-0. `static FOO foo_zero` will initialize pointer and floating-point fields as if they were given an initial value of 0, which theoretically might not be all-bits-0, on some suitably obscure platform. – Steve Summit Jun 25 '21 at 20:21
  • 2
    No, as we just agreed in the comments, semantically using `calloc` would be different. – SergeyA Jun 25 '21 at 20:23
  • @SergeyA Static variables are zero initialized. And calloc also zero initializes allocated memory. – Vlad from Moscow Jun 25 '21 at 20:25
  • 2
    @VladfromMoscow there are nuances, you can read the comments under the question. Bottom line: static initialization initializes pointers to null pointers, while `calloc` initializes bits to 0. Technically speaking, null pointer != all zero bits. Credit goes to @Barmar. – SergeyA Jun 25 '21 at 20:27
  • @SteveSummit According to the C Standard static variables are zero-initialized. So your invented processors have nothing in common with the C Standard. – Vlad from Moscow Jun 25 '21 at 20:27
  • 1
    @VladfromMoscow Zero initialization is *not* necessarily all-bits-0. See answers cited elsewhere. It's an obscure point with almost no practical application, but it prevents saying that `calloc` is equivalent here. – Steve Summit Jun 25 '21 at 20:28
  • @SteveSummit All the static memory is zero initialized before a process gets control. – Vlad from Moscow Jun 25 '21 at 20:33
  • 2
    Per C 2018 6.7.9 10, default initialization for a null pointer initializes it to a null pointer. The C standard does not require the object representation of a null pointer to be bytes containing zeros (it only requires that a null pointer compare unequal to any object or function). `calloc` does set the bytes of memory to zeros, so it does not produce a null pointer in C implementations in which bytes of zeros are not the representation of a null pointer. – Eric Postpischil Jun 25 '21 at 20:40
  • 1
    (The fact that `0` is a *null pointer constant* does not mean a null pointer is represented by zero; null pointer constants are converted during compilation to whatever representation the compiler uses or are otherwise optimized.) – Eric Postpischil Jun 25 '21 at 20:41
  • 1
    @EricPostpischil Initializing an object of the type _Bool by any value including a null pointer evaluates to 0 only when the initialized value is equal to 0. Neither null pointer is converted to whatever as you are saying. Or give me a reference to a C compiler that satisfies the C Standard instead of saying bla...bla...bla... – Vlad from Moscow Jun 25 '21 at 20:47
  • 1
    @VladfromMoscow Some references (albeit to earlier versions of the Standard) can be found in [question 7.31](http://c-faq.com/malloc/calloc.html) of the [C FAQ list](http://c-faq.com/). Nothing about zero initialization has changed since that was written (other than that the obscure systems where zero might not be all-bits-0 have continued to get rarer). – Steve Summit Jun 25 '21 at 20:49
  • @EricPostpischil Moreover a code compiled by one compiler can be called by a code compiled by another compiler. And what will you get? Undefined behavior? – Vlad from Moscow Jun 25 '21 at 20:51
  • 1
    @VladfromMoscow: A null pointer compares equal to a null pointer constant (including `0`, which is a null pointer constant) because, in a comparison of a pointer and a null pointer constant with `==`, the null pointer constant is converted to the type of the pointer (C 2018 6.5.9 5), which produces a null pointer (6.3.2.3 3) even if the representation of a null pointer is not zero. – Eric Postpischil Jun 25 '21 at 20:59
  • 1
    @VladfromMoscow: Code compiled by one compiler can be called by another compiler if they use the same ABI. In the absence of conforming to a common ABI (which is beyond the C standard), it is not universally true that code compiled by one compiler can be successfully called by code compiled by a different compiler. Yes, this will be undefined behavior per the C standard, because the C standard only defines the behavior of a C implementation, not of interactions between multiple C implementations. – Eric Postpischil Jun 25 '21 at 21:01
  • @EricPostpischil My example has nothing in common with ABI. It shows clearly that you will have undefined behavior if your compiler does not satisfy the C Standard. That is all. – Vlad from Moscow Jun 25 '21 at 21:05
  • Typo above, the first “null” should not be there: “Per C 2018 6.7.9 10, default initialization for a null pointer initializes it to a null pointer.” should be “Per C 2018 6.7.9 10, default initialization for a pointer initializes it to a null pointer.” – Eric Postpischil Jun 25 '21 at 21:08
0

Why complicate things?

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

typedef struct {
    int buzz;
    char *bazz;
} FOO;

init_foo() above really just returns a new foo allocated on the heap, with all members cleared. So let's just do that:

static inline FOO * new_foo(void)
{
    return calloc(1,sizeof(FOO));
}

Usage and check:

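int main (void)
{
    FOO * foo = new_foo();

    if (foo == NULL)
        return EXIT_FAILURE;    // calloc() can fail

    // Passing a null pointer to %s is undefined behavior, so substitute a string
    printf("buzz: %d, bazz: %s", foo->buzz, foo->bazz ? foo->bazz : "(null)" );

    free(foo);

    return 42;
}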

Godbolt shows the new struct FOO nicely zeroed:

Program returned: 42
buzz: 0, bazz: (null)
Chef Gladiator
  • 1
    As explained elsewhere, this technique is ever-so-slightly inferior. While it's not the case for any currently-popular environment, it's theoretically possible for non-integer types, including pointers and floating-point values, to have zero values which are *not* represented by the all-bits-0 bit pattern which `calloc` gives you. The `static FOO foo_zero` technique asked about in the question is designed to work properly even on such a platform. – Steve Summit Jun 27 '21 at 18:48
  • Thanks @SteveSummit. In practice, I have never seen such a platform, from 1991 to date. – Chef Gladiator Jun 27 '21 at 19:07