15

Consider this:

void* x = &x;

printf("%p\n", x);

Surprisingly, it compiles and runs with the output:

7ffb2f7248

How is it possible for x to take the address of itself when x hasn't been created yet?

Edit: Note that in this case there is no ambiguity regarding its assignment, which makes clear exactly what is being asked.

  • 6
    That one’s easy, since the object type is complete when the *declarator* (`x`) is recognized. `void *x[] = { &x };` is more fun since the type is not complete until the *init-declarator* is fully parsed. Or `void *x[] = { &x[2], 0, 0 };`. – Eric Postpischil Jul 29 '20 at 20:38
  • 2
    @EricPostpischil but most implementations just set the symbol. The linker does the job, having the complete information. – 0___________ Jul 29 '20 at 20:40
  • 3
    Quote: "Surprisingly, it compiles and runs..." Well, I would be more surprised if it didn't. What makes you think it shouldn't? The code defines a variable and initializes it - no strange things going on. – Support Ukraine Jul 29 '20 at 20:40
  • 6
    You should not tag questions like this with both C and C++; the answers are different for the different languages. Please delete one tag and, if you wish to ask about that language, enter a separate question. – Eric Postpischil Jul 29 '20 at 20:46
  • 4
    This is quite a useful feature; it allows this common pattern of array allocation in C: `int len=10; int* array = malloc(len*sizeof*array);`. This way, the type of the array does not have to be repeated, and it will not produce bugs when someone decides to change the array's type. – Quimby Jul 29 '20 at 21:55
  • 3
    Dup of [Error with C++ syntax, compiler doesn't warn or error for int v = func(&v);](https://stackoverflow.com/questions/51052610/error-with-c-syntax-compiler-doesnt-warn-or-error-for-int-v-funcv) (this question is exactly about taking address, questions about just self-referencing like `int x = x;` exist since at least 2010). [Why does this C code compile?](https://stackoverflow.com/questions/3239386/why-does-this-c-code-compile) has wording for C. Might be fine to mark as a dup of only the later question. – Language Lawyer Jul 30 '20 at 01:43
  • That has a function call, and it doesn't set `x` itself to its own address. –  Jul 30 '20 at 08:32
  • 1
    Another potential use of this feature is the small-container optimization: A container that contains a small buffer for its elements when there are only very few elements. In these cases, the container can be constructed with `Container x = { .data = &x.smallBuf };`. This can be found in some C codes, C++ codes would use proper constructors instead. – cmaster - reinstate monica Jul 30 '20 at 10:19
  • @4386427 It's just as (non-)surprising as a function being able to call itself. It's a feature of the language that either allows it, or doesn't (*cough* Fortran *cough*). – cmaster - reinstate monica Jul 30 '20 at 10:23
  • 1
    @super if we marked questions as dups only when they match byte-by-byte, we wouldn't be able to find dups – Language Lawyer Jul 30 '20 at 11:25
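
The allocation pattern mentioned in the comments above relies on the same rule: `array` is already in scope inside its own initializer, and `sizeof` does not evaluate `*array`, it only uses its type. A minimal sketch, assuming a hypothetical helper named `make_zeroed_array` (not from the thread):

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates len ints and sets them to zero.
 * Returns NULL if the allocation fails. The caller frees the result. */
int *make_zeroed_array(size_t len) {
    /* array is in scope here; sizeof *array only inspects the type (int),
     * so the not-yet-initialized pointer is never dereferenced. */
    int *array = malloc(len * sizeof *array);
    if (array == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        array[i] = 0;
    }
    return array;
}
```

If the element type is later changed from `int` to something else, only the declaration needs to change; the `sizeof *array` expression follows automatically.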

5 Answers

26

C++ 2018 6.3.2 [basic.scope.pdecl] 1 says:

The point of declaration for a name is immediately after its complete declarator (Clause 11) and before its initializer (if any), except as noted below.

The “noted below” items do not apply here, so, in void* x = &x;, x is declared after the first x (which is the declarator) and before the = &x (the initializer), so x can be referred to in the initializer, essentially as if the code were void *x; x = &x;.
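
One way to observe the point of declaration is with shadowing. The sketch below is only an illustration, and the function names and the file-scope `x` are assumptions made for the example: if the inner `x` were not yet declared inside its own initializer, `&x` would have to refer to the outer `x` instead.

```c
int x = 42; /* file-scope x, used only to show which x the initializer sees */

/* Returns 1 if the local x ends up holding its own address. */
int inner_x_points_to_itself(void) {
    void *x = &x; /* this &x names the local x, which is already in scope */
    return x == (void *)&x;
}

/* Returns 1 if the local x ends up pointing at the file-scope x instead. */
int inner_x_points_to_outer(void) {
    void *outer = &x; /* no local x declared yet, so this is the file-scope x */
    int result;
    {
        void *x = &x; /* shadows the file-scope x before the initializer */
        result = (x == outer);
    }
    return result;
}
```

Under the rule quoted above, the first function should return 1 and the second 0, since the initializer never sees the file-scope `x`.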

Eric Postpischil
10

This answer is for the C-tag.

The declaration:

void* x = &x;

contains an init-declarator, which has the form declarator = initializer.

This can be found in section 6.7 of N1570 (draft C standard) and 6.7.6 says:

A full declarator is a declarator that is not part of another declarator. The end of a full declarator is a sequence point.

And 6.2.1/7 says:

Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. ...<some text not quoted>... Any other identifier has scope that begins just after the completion of its declarator.

These sections tell us that the object is created before it is initialized. Consequently, the initializer can use the address of the created object.

In section 6.3.2.3 Pointers:

A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

Therefore, since a pointer to any object type can be converted to a void-pointer, the address of x can be assigned to a void-pointer.

So for your code:

First the object is created and then it is initialized.

void* x = &x;
^^^^^^^ ^^^^^  
   |      |
   |      step 2: initialize object
   |
   step 1: create object, i.e. end of full declarator

For this code it's pretty much the same as an assignment.

void* x; // Create
x = &x;  // Assignment instead of initialization
         // But in this case it works just the same
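
As a rough check of the two points above (the conversion rule from 6.3.2.3 and the equivalence with plain assignment), here is a small sketch. The function names are made up for the example, and it is C only, since C++ would need an explicit cast to convert back from `void *`.

```c
#include <stdbool.h>

/* Initialization form: &x has type void **, which converts to void *. */
bool round_trip_is_equal(void) {
    void *x = &x;
    void **back = x; /* convert back to the original pointer type */
    /* The round trip compares equal to &x, and reading through it yields x. */
    return back == &x && *back == x;
}

/* Assignment form: declare first, then store the address. */
bool assignment_form_is_equal(void) {
    void *x;
    x = &x;
    void **back = x;
    return back == &x && *back == x;
}
```

Both functions are expected to return true, which matches the claim that initialization and assignment behave the same here.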
Support Ukraine
6

You are conflating allocation and initialization.

Storage is allocated for x simply because it is defined to exist by this declaration. Initialization is a separate step that follows allocation of storage. Because of this, the address of x can be known during its own initialization.

(Note that such a pointer cannot be safely dereferenced until the pointed-to value is done being initialized, but there is no reason why the address cannot be taken prior to initialization.)
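
A common place where this shows up is a self-referential struct. The following is a minimal sketch with an assumed `struct node` type and function name (neither is from the question): the initializer only takes the address of `n`, and the pointer is dereferenced only after initialization has finished.

```c
/* Hypothetical list node used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Builds a one-element circular list and returns 1 if it links to itself. */
int single_node_ring(void) {
    /* Only the address of n is used here; n is not read during initialization. */
    struct node n = { .value = 7, .next = &n };

    /* Initialization is complete, so dereferencing n.next is now safe. */
    return n.next == &n && n.next->value == 7;
}
```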

cdhowie
  • To add to this, during the initialization, `&x` is also valid because the compiler has already seen the `x` identifier and knows what `x` refers to - what type it is, how big it is, etc - so the `&` address-of operator is thus able to return *where* the storage for `x` is located, even though the *contents* of that storage are still in the process of being initialized. – Remy Lebeau Jul 29 '20 at 20:35
  • @RemyLebeau usually the compiler just leaves the symbol. The linker does the address job – 0___________ Jul 29 '20 at 20:39
  • 3
    @P__J__ That would be an *implementation detail* of how the `&` operator works. But the parsing of the `&x` expression itself is handled by the compiler, not the linker, so obviously `x` needs to be a valid identifier by the time the compiler reaches `&x`. And `x` is a valid identifier in the context of the *initialization* of `x`, having already been defined by the *declaration* of `x`. That is all I'm trying to point out. – Remy Lebeau Jul 29 '20 at 20:41
  • Yes but the object was already defined. And it is known. – 0___________ Jul 29 '20 at 20:42
  • Need normative reference for LL tags. – SergeyA Jul 29 '20 at 20:49
  • 2
    In addition to the allocation of storage error, this is irrelevant. In `extern int x; void *y = &x;`, we are able to refer to `x` even though no storage has been allocated for it yet (barring some prior definition). The “Because of this” in this answer is incorrect. – Eric Postpischil Jul 29 '20 at 21:00
  • Your answer doesn't fully address the question even without the ll tag. For example, it doesn't explain why it is possible to take the address of a variable before its definition is seemingly concluded (a natural assumption would be that the variable definition is complete after the comma or semicolon). Your answer fails to address this. – SergeyA Jul 29 '20 at 21:01
  • This answer also fails to address the question of taking addresses of entities for which no storage has been allocated. – SergeyA Jul 29 '20 at 21:02
  • 1
    Such a declaration+initialization was not given in the question. The answer addresses _the specific declaration in the question._ It is not meant to apply to any declaration that could be formed in the language. – cdhowie Jul 29 '20 at 21:04
0

How is it possible for x to take the address of itself when x hasn't been created yet?

Because an object doesn't need to exist in order to know where it will be created.

eerorika
0

Very simple. The addresses of objects with static storage duration are known to the linker, which fills in all the addresses.

For automatic storage variables that are not optimized out and whose address is used, the address is easy for the compiler to calculate.

https://godbolt.org/z/e84xM3
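
The two cases can be sketched side by side. The names below are invented for the example, and the comments describe what a typical implementation might do rather than anything the standard requires.

```c
/* Static storage duration: &static_self is an address constant, so it is
 * allowed in a file-scope initializer; a typical toolchain resolves it at
 * link or load time. */
static void *static_self = &static_self;

/* Returns 1 if the static object holds its own address. */
int static_points_to_self(void) {
    return static_self == (void *)&static_self;
}

/* Returns 1 if the automatic object holds its own address. */
int automatic_points_to_self(void) {
    /* Automatic storage: the address is typically an offset from the
     * stack or frame pointer, computed when the function runs. */
    void *local = &local;
    return local == (void *)&local;
}
```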

0___________