-3

Can someone explain to me why *x is consistently returning 0 while *(x + 1) returns values like -805306368, 536870912?

int * x = malloc(sizeof(int) * 2);
printf("%d, %d\n", *x, *(x + 1));

I'm using gcc, but have experienced the same behavior with clang.

My understanding was that malloc would allocate enough memory on the heap for two int values. I'm assuming that *x will reference the first uninitialized int while *(x + 1) will reference the second uninitialized int.

I don't think this is a duplicate of this because *x is always 0. I have a decent grasp of why *(x + 1) is returning "garbage", but less so for why *x is so consistently 0.

wpcarro
  • 5
    There is no 'why', you're reading uninitialized memory, there is no point in reasoning about its content. – tkausl Sep 27 '18 at 18:13
  • I'd eventually like a way to loop through the fixed sized array and only print the values that have been initialized. I'm not sure how to accomplish this as is. – wpcarro Sep 27 '18 at 18:15
  • Why would you expect *anything* from it? There is no RNG or something which is filling uninitialized memory. The values are just some leftovers of some other stuff that was there before. – Eugene Sh. Sep 27 '18 at 18:15
  • 1
    @wpcarro There is no way to distinguish between initialized or uninitialized memory without extra information. – Eugene Sh. Sep 27 '18 at 18:16
  • 1
    There's no way to know, just by looking at a piece of memory, whether it's been explicitly initialized or contains random garbage. – 500 - Internal Server Error Sep 27 '18 at 18:16
  • Possible duplicate of [(Why) is using an uninitialized variable undefined behavior?](https://stackoverflow.com/questions/11962457/why-is-using-an-uninitialized-variable-undefined-behavior) – dandan78 Sep 27 '18 at 18:18
  • You could use `calloc` if you want to allocate and initialize (to `0`) memory at once. – bool3max Sep 27 '18 at 18:27
  • 1
    I am surprised at how many people refuse to understand the question as if their sanity depended on not understanding it. Sorry wpcarro, doesn't seem like you'll be getting a useful answer here. – Tasos Papastylianou Sep 27 '18 at 18:28
  • 2
    One possibility since you are running the same program again and again: the relic at `x[0]` was some *value*, and the relic at `x[1]` was some *address* and since the program might not always be launched at the **same** address, the first is consistent and the second is not. One way to find out if you have set specific elements is to pre-fill the array with a value you know will never be there. Then you can look through the array afterwards and see which elements have never been set to a useful value. – Weather Vane Sep 27 '18 at 18:51
  • @Tasos: I sympathise with your point of view, but questions about the psychology of computer programmers (even of the subset who contribute to SO) are out of scope. However, I think that there's far from enough information here to give a useful answer, which is a contributing factor to the lack of such a beast. What malloc implementation is being used, for example? Since malloc is underneath there managing the memory, that's crucial. And what's sizeof(void*); IOW, if malloc wants to store a couple of pointers in a free block, how much space does it take? Are the blocks being freed? Etc. – rici Sep 28 '18 at 00:28
  • Reasoning about undefined behavior is almost always a futile attempt at understanding chaos. The point about undefined behavior is not that the system will do something different each time, the point is **that it is allowed to**. There can be *any number* of reasons why there is a 0 in that memory location, none of which may hold true in the future. Almost *always*, the reason for trying to understand what undefined behavior means in a given case **is that someone wants to depend on this behavior**. ***Don't do that!*** – Lasse V. Karlsen Sep 28 '18 at 06:09
  • @rici not sure what you mean by "questions about the psychology of computer programmers". The question was not a "why do I feel this way", it was a very specific question. Three commenters here (WeatherVane, Eugene, and Lasse) have given good and useful answers which combined could have made a great canonical answer to this question, providing some insight into how and why such behaviour might take place w.r.t. a compiler and typical memory use (which is presumably what OP was really asking for) as well as why it should thus not be relied upon, regardless of superficial consistency. – Tasos Papastylianou Sep 28 '18 at 10:01
  • @Tasos: I probably wasn't being clear enough. I agree that the original question is clear and has nothing to do with psychology. You raised the psychological question, at least implicitly, although I don't see your comment right now. And I thought it was a good question, but sadly out of scope here. As far as the answers and comments go, I don't think that any of them really address the most likely explanation, which is that the memory at the beginning of the block is used by the malloc implementation to link free blocks together. (ptmalloc definitely does this, for example.) – rici Sep 28 '18 at 20:09

2 Answers

2

*x is equivalent to *(x + 0), and *(x + n) is equivalent to x[n], where x is a pointer and n is an integer. Hence you're printing x[0] and x[1]: the first and second elements of an integer array.


The bytes of an object allocated with malloc are indeterminate until they are initialized, and hence so are the values of the objects. The standard says (C11 3.19.2-3.19.4; same text in C17 but cannot link it as nicely):

indeterminate value

either an unspecified value or a trap representation

and

unspecified value

valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance

NOTE An unspecified value cannot be a trap representation.

and

trap representation

an object representation that need not represent a value of the object type

An int object cannot have a trap representation in GCC. However, the behaviour is still not well-specified, as the standard does not impose requirements on which value is chosen in any instance - so even

printf("%d, %d\n", x[0], x[0]);

can print

0, 42

so "checking" the indeterminate values is meaningless.

The "as-if rule" allows the compiler to elide the call to malloc altogether - so that even if the malloc implementation always gave a block with first 4 bytes zeroed, the compiled code could have any changing number for x[0].

Community
  • Upvoting for the explanation of checking uninitialized values with a second array. Still unsure of why `*x` is consistently `0`. Even with `gcc` and `clang` and testing against `printf("%d, %d", *x, x[0])` – wpcarro Sep 27 '18 at 18:33
  • 1
    Because on your environment it is just happening that something or someone is always writing zero to it before your program (or this function) is running. Or the compiler might even take the UB literally and just print `0` for any uninitialized value if it finds it beneficial for optimization. – Eugene Sh. Sep 27 '18 at 18:35
  • 1
    And "consistently" might just be "as far as I have observed". Any number of future changes to your program, your computer, framework, compilers, heck even things on the internet might affect this (if for some reason this value is left behind after calling a function that downloads an image). The point about undefined behavior isn't that "your current system does something different each time", the point is **that it is allowed to**. – Lasse V. Karlsen Sep 28 '18 at 06:08
0

I'm assuming that *x will reference the first uninitialized int while *(x + 1) will reference the second uninitialized int.

Your assumption is correct. But as you say yourself, you are dereferencing pointers to uninitialized data that has indeterminate values. From looking at the data alone, it is impossible to tell whether it has been initialized or not.

To initialize the memory use calloc():

#include <stdlib.h>
#include <stdio.h>

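int main(void)
{
    /* calloc returns zero-filled memory, so both ints read as 0 */
    int *x = calloc(2, sizeof *x);
    if (x == NULL)
        return EXIT_FAILURE;

    printf("%d, %d\n", x[0], x[1]);
    free(x);
}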

I'd eventually like a way to loop through the fixed sized array and only print the values that have been initialized. I'm not sure how to accomplish this.

Use a second array to keep track of the assignments:

#include <assert.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>

enum { LENGTH = 4 };

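int main(void)
{
    /* calloc zero-fills both arrays, so every flag starts out false */
    int *data = calloc(LENGTH, sizeof *data);
    bool *initialized = calloc(LENGTH, sizeof *initialized);
    assert(data && initialized);  /* stop here if an allocation failed */

    /* record every assignment in the parallel flag array */
    data[1] = 0;
    initialized[1] = true;

    data[3] = 42;
    initialized[3] = true;

    /* print only the elements that were assigned (1-based position) */
    for (size_t i = 0; i < LENGTH; ++i)
        if (initialized[i])
            printf("%zu: %d\n", i + 1, data[i]);

    free(initialized);
    free(data);
}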
Swordfish
  • Is it UB? The behavior is precise: 'display the values that are stored at that location'. The program does it, and does it reliably. – pm100 Sep 27 '18 at 18:26