12

Let's say we define two structures in either C or C++ (I get the same result in both, and I think the rules are the same; tell me if they are not).

one with a value member of an incomplete type:

struct abc{
  int data;
  struct nonexsitnow next; //error: field ‘next’ has incomplete type, makes sense since "struct nonexsitnow" hasn't been declared or defined.
};

and one with a pointer member of an incomplete type:

struct abc{
  int data;
  struct nonexsitnow *next; //pass?!
};

Why doesn't the second definition cause any issue? It uses struct nonexsitnow which hasn't been created!

UPDATE:

I put together the table below from the answers and comments; I hope it is correct and useful as a summary.

As @ArneVogel has mentioned, both struct Foo p; and struct Foo *p; are implicit declarations of Foo, and struct Foo; does this job explicitly (thanks @John Bollinger). This means nearly nothing for but makes a difference for , and behaviors:

               In a situation where struct Foo is:
_______________________________________________________
|               | undeclared | declared but | defined |
|               |            | not defined  |         |
-------------------------------------------------------
|               |  C | C++   |  C  |  C++   | C | C++ |
-------------------------------------------------------
|struct Foo  p; |  × |  ×    |  ×  |   ×    | √ |  √  |
|struct Foo* p; |  √ |  √    |  √  |   √    | √ |  √  |
|       Foo  p; |  × |  ×    |  ×  |   ×    | × |  √  |
|       Foo* p; |  × |  ×    |  ×  |   √    | × |  √  |
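
A minimal C sketch of the rows above, assuming a C11 compiler. The variable names p1, p2, v1 and v2 are made up for illustration. It shows one way the bare name Foo can be made usable in C (a typedef), which is not something the table itself covers:

```c
/* Sketch in C. The names p1, p2, v1, v2 are illustrative only. */
struct Foo *p1;           /* ok: this line itself declares struct Foo */

typedef struct Foo Foo;   /* C only treats a bare Foo as a type after a typedef */
Foo *p2;                  /* ok: pointer to a still-incomplete type */

struct Foo { int x; };    /* struct Foo is complete from here on */

struct Foo v1;            /* ok: the type is now defined */
Foo v2;                   /* ok: same type, reached through the typedef */
```

Moving either of the last two declarations above the struct Foo definition should produce the incomplete-type error from the question. In C++ the typedef is unnecessary because a declared class name is already a type name.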
alandawkins

4 Answers

34

Why doesn't the second struct definition cause any problem?

Because it is a pointer. The size of the pointer is known to the compiler even if the type the pointer is pointing to is incomplete.
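
A small sketch of this in C, assuming a C11 compiler. The tag not_defined_yet and the function next_member_size are hypothetical names chosen for the example:

```c
#include <assert.h>
#include <stddef.h>

/* struct not_defined_yet is never defined in this file. Writing its tag
   after the struct keyword is enough to declare it as an incomplete type. */
struct abc {
    int data;
    struct not_defined_yet *next;   /* fine: the pointer type itself is complete */
};

/* struct abc is complete, so the compiler already knows its layout. */
static_assert(sizeof(struct abc) >= sizeof(int) + sizeof(struct not_defined_yet *),
              "struct abc holds an int and a pointer");

/* sizeof works on the pointer even though the pointee's size is unknown. */
size_t next_member_size(void) {
    return sizeof(struct not_defined_yet *);
}
```

Replacing the pointer member with a plain struct not_defined_yet member should bring back the "incomplete type" error, because the compiler would then need the size of the structure itself.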

JFMR
  • The fact that it is a pointer to a class type is also important. Other pointer-types can have different representation, though only unions also give rise to pointers to incomplete types. Well, MSVC also has trouble with member-function- / member-data-pointers, but that's a point of non-conformance of long standing. – Deduplicator Jan 30 '18 at 18:03
8

The compiler needs the size of a struct / class in order to know how much memory to allocate when an object of that type is created.

On a given platform, sizeof(T*) will always return the same value, for any type T that is a struct or class type.

That is why you can use pointers to forward-declared types with no errors. Of course, to dereference such a pointer or access the contents of the object it points to, the definition must be provided. You can, however, assign a value to such a pointer (as long as the types are compatible).
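
As a rough illustration in C (the names node, current, set_current, is_current and peek are invented for this sketch), the operations allowed before and after the definition look like this:

```c
#include <assert.h>
#include <stddef.h>

struct node;                        /* declared, not yet defined */

static struct node *current;        /* ok: an object of pointer type */

/* ok: storing and comparing pointers needs no knowledge of the pointee */
void set_current(struct node *n) { current = n; }
int is_current(const struct node *n) { return n == current; }

/* int peek(const struct node *n) { return n->value; }
   would not compile at this point: struct node is still incomplete */

struct node { int value; };         /* the type is completed here */

/* ok now: member access is possible once the definition is visible */
int peek(const struct node *n) {
    assert(n != NULL);
    return n->value;
}
```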

An important fact:

In C, where you typically use a so-called "C-style cast", a pointer can usually be converted and assigned regardless of the pointed-to types (it is your responsibility to ensure correct behavior and to meet alignment requirements).

In C++, however, whether a cast between pointers to incomplete types is possible depends on the kind of cast. Consider two polymorphic types:

class A; // forward declaration only
class B; // forward declaration only, actually inherits from A

A* aptr;
B* bptr;

bptr = (B*)(aptr); // ok
bptr = dynamic_cast<B*>(aptr); // error

dynamic_cast<> fails to compile if the compiler does not have access to the definitions of the types involved in the cast (they are needed to perform the run-time check). Example: Ideone.

prl
Mateusz Grzejek
  • 3
    Everyone seems to be missing an important point here: Just writing `struct Foo *f;` will implicitly forward-declare `Foo`. This works only when `struct` etc. is used, is an inherited behavior from C, and seems to be a point of confusion to the OP. – Arne Vogel Jan 30 '18 at 14:44
  • 2
    You are wrong. Pointers to different types, especially function-pointers, may differ. Though all struct-/class-pointers must have the same representation. – Deduplicator Jan 30 '18 at 17:59
  • @Deduplicator *Compiler needs the size of a **struct / class**..* I think it's pretty clear, that my explanation goes around types. – Mateusz Grzejek Jan 30 '18 at 18:55
  • 3
    You said "*On a given platform, `sizeof(T*)` will always return the same value, regardless what T is.*" which is patently wrong. If you restricted yourself to pointers to class-types, you would have been ok. – Deduplicator Jan 30 '18 at 18:57
  • 1
    @ArneVogel "_Just writing `struct Foo *f;` will implicitly forward-declare `Foo`_", I just thought that it means "**ok, you want a pointer to a struct.**", in fact it's "**you declare a `struct Foo` and want a pointer to it**". Right? – alandawkins Feb 01 '18 at 06:10
  • 1
    @AlanDawkins Indeed, and [here is a demo](https://godbolt.org/g/m4onnm). – Arne Vogel Feb 01 '18 at 09:55
6

In the first place, you need to understand what it means for a type to be "incomplete". C defines it this way:

At various points within a translation unit an object type may be incomplete (lacking sufficient information to determine the size of objects of that type) or complete (having sufficient information).

(C2011, 6.2.5/1)

Note well that type completeness is a function of the scope and visibility of declarations, not an inherent characteristic of types. A type can be incomplete at one point in a translation unit, and complete at a different point.

However,

A pointer type may be derived from a function type or an object type, called the referenced type. [...] A pointer type is a complete object type.

(C2011, 6.2.5/20; emphasis added)

Without qualification, then, all pointer types are complete types, even pointers whose referenced types are not themselves complete. How a particular implementation makes this work is not addressed by the standard, but ordinarily, all pointer-to-structure types have the same size and representation (which has nothing to do with the representation of their referenced types).

This turns out to be important, because a structure type is incomplete until the closing brace of its definition, so if pointers to incomplete types were not themselves complete, then a structure could not contain a pointer to another structure of its own type, such as is commonly used to implement linked lists, trees, and other data structures.
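
A minimal sketch of that self-referential case in C. The names list_node and list_sum are illustrative, not taken from the question:

```c
#include <stddef.h>

struct list_node {
    int value;
    struct list_node *next;   /* struct list_node is still incomplete on this line */
};

/* Walks the list and adds up the stored values; an empty list sums to 0. */
int list_sum(const struct list_node *head) {
    int total = 0;
    for (; head != NULL; head = head->next) {
        total += head->value;
    }
    return total;
}
```

Inside the braces, struct list_node has unknown size, yet the member next is accepted because only the pointer's size is needed to lay out the structure.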

On the other hand,

A structure or union type of unknown content [...] is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or union tag with its defining content later in the same scope.

(C2011, 6.2.5/22)

This stands to reason, since the compiler cannot know how big a structure type is if it does not know what its members are. It then furthermore makes sense that

A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type [...].

(C2011, 6.7.2.1/3; emphasis added)

The exception describes a C feature called a "flexible array member", which comes with several caveats and restrictions. That's a tangential matter that you can read (or ask) about separately.
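
For orientation only, here is a short C99-style sketch of a flexible array member. The type message and the helper message_new are hypothetical names, and the allocation strategy is just one common way to use the feature:

```c
#include <stdlib.h>
#include <string.h>

struct message {
    size_t length;
    char text[];   /* flexible array member: incomplete array type, must be last */
};

/* Allocates one block holding the header plus a copy of s (including its
   terminator). Returns NULL on allocation failure; release with free(). */
struct message *message_new(const char *s) {
    size_t n = strlen(s);
    struct message *m = malloc(sizeof *m + n + 1);
    if (m == NULL) {
        return NULL;
    }
    m->length = n;
    memcpy(m->text, s, n + 1);
    return m;
}
```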

Additionally, all of the foregoing is consistent with the fact that C and C++ permit you to reference a structure type by its tag prior to its members being declared; that is, when it is an incomplete type. This can be done on its own as a forward declaration ...

struct foo;

... but that doesn't serve any but documentary purposes, because forward declaration of structure types is not required. You can think again of the linked-list usage, but this characteristic is in no way limited to such contexts.

Indeed, a relatively common use case is to implement opaque types. In such a case, a library produces and consumes a data type whose implementation it does not want to disclose, for any of a variety of reasons. It can nevertheless hand out appropriately-typed pointers to instances of such structures to client code, and expect to receive such pointers back. If it never provides a definition of the referenced type, then the client code has to treat the referenced objects as opaque blobs.
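
The opaque-type pattern could be sketched like this in C. The counter type and its four functions are invented for illustration, and the "header" and "source file" are shown in one listing only to keep the example self-contained:

```c
#include <stdlib.h>

/* --- what a hypothetical public header would expose --- */
struct counter;                              /* clients never see the members */
struct counter *counter_create(void);
void counter_increment(struct counter *c);
int counter_value(const struct counter *c);
void counter_destroy(struct counter *c);

/* --- what the library's private source file would contain --- */
struct counter {
    int count;
};

/* Returns a zero-initialized counter, or NULL if allocation fails. */
struct counter *counter_create(void) {
    return calloc(1, sizeof(struct counter));
}

void counter_increment(struct counter *c) { c->count++; }

int counter_value(const struct counter *c) { return c->count; }

void counter_destroy(struct counter *c) { free(c); }
```

Client code that sees only the first part can hold and pass struct counter pointers, but cannot create a struct counter by value or touch count directly.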

John Bollinger
  • "_Without qualification, then, all pointer types are complete types, even pointers whose referenced types are not themselves complete._" but I get an error when `nosuchtype *p;`, "**unknown type name 'nosuchtype'**" for gcc and "**use of undeclared identifier 'nosuchtype'**" for clang. – alandawkins Feb 01 '18 at 06:23
  • Maybe **A pointer type is a complete object type** when the referenced type is at least declared? – alandawkins Feb 01 '18 at 06:29
  • @AlanDawkins, the question of whether `nosuchtype *` designates a *complete* type makes sense only where it designates a type at all, which is in exactly those places where `nosuchtype` itself designates a type. This is a separate question, but no, it does not necessarily require a separate declaration. In particular, it does not require a separate declaration if `nosuchtype` is a structure type designated via the `struct` keyword and a tag: `struct anything`. – John Bollinger Feb 01 '18 at 13:07
  • "Without qualification," - no pun intended. – Casey Feb 02 '18 at 18:07
3

The short answer is: because the C and C++ standards both say so.


Others have answered "because the compiler knows how big pointers are". But that really doesn't answer why. There are languages that permit incomplete types inside other types, and they work fine.

If any of these assumptions were changed, C/C++ could support incomplete structs within structs:

  1. C/C++ actually stores values. Many languages, when given a composite data type (a class or struct) within another, store a reference or pointer instead of the actual value of that composite data type.

  2. C/C++ wants to know how big complete types are. It wants to be able to create arrays, calculate their size, calculate the offset between elements.

  3. C/C++ wants single-pass compiling. If the compiler were willing to note that there was an incomplete type there, continue compiling until it found out later how big it is, then come back and insert the size of the type into the generated code, incomplete types would be fine.

  4. C/C++ wants types to be complete after you define them. You could easily insert a rule stating that abc was only complete once nonexistnow's definition was visible. Instead, C++ wants abc to be complete right after the closing }, probably for simplicity's sake.
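
Points 1, 2 and 4 can be seen together in a small C11 sketch. The names inner, outer, later and outer_array_bytes are assumptions made for the example:

```c
#include <assert.h>
#include <stddef.h>

struct inner { int a; int b; };     /* complete before it is embedded */

struct outer {
    int data;
    struct inner by_value;          /* stored inline: needs sizeof(struct inner) */
    struct later *by_pointer;       /* stored as a pointer: struct later may stay incomplete */
};

/* struct outer is complete at its closing brace, so layout questions
   can be answered immediately, without waiting for struct later. */
static_assert(offsetof(struct outer, by_value) >= sizeof(int),
              "by_value comes after data");
static_assert(offsetof(struct outer, by_pointer) >=
                  offsetof(struct outer, by_value) + sizeof(struct inner),
              "by_pointer comes after the embedded struct");

/* Array sizes and element offsets follow directly from sizeof. */
size_t outer_array_bytes(size_t n) {
    return n * sizeof(struct outer);
}
```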

Finally, pointers to structs satisfy all of these requirements because the C++ standard demands that they do. That seems like a cop-out, but it is true:

On some platforms, the size of a pointer varies with the features of the thing pointed to (in particular, pointers to single-byte characters on platforms where native pointers address quad words are larger). C/C++ permits this, but requires that void* be large enough for the largest pointer, and that pointers-to-struct have a fixed size. This hurts such platforms, but they are willing to do this in order to permit pointers to incomplete structs inside complete structs.

Quite possibly the compiler would rather that a struct small { char c; } be 1 byte in size, and hence pointers to it be "wide"; but because all pointers-to-struct must have the same size, and they don't want to use wide pointers for every struct, instead we have sizeof(small) == 4 on such systems.
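
The fixed size of pointers-to-struct can be checked at compile time. This is a sketch for a C11 compiler; big, never_defined and struct_pointer_sizes_match are hypothetical names, and no claim is made here about what sizeof(struct small) is on any particular platform:

```c
#include <assert.h>

struct small { char c; };
struct big { double values[64]; };
struct never_defined;               /* stays incomplete in this file */

/* Pointers to structure types share one representation, so their sizes
   match no matter how large, small, or unknown the pointee is. */
static_assert(sizeof(struct small *) == sizeof(struct big *),
              "pointers to defined structs have one size");
static_assert(sizeof(struct small *) == sizeof(struct never_defined *),
              "a pointer to an incomplete struct has that size too");

/* Run-time view of the same fact, returning 1 when the sizes agree. */
int struct_pointer_sizes_match(void) {
    return sizeof(struct small *) == sizeof(struct big *) &&
           sizeof(struct big *) == sizeof(struct never_defined *);
}
```

No such check is written for char * or function pointers, because nothing discussed above requires those to have the same size as a pointer to a struct.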

It isn't a law of computing that all pointers are the same size, and it isn't a law that structs have to know how big they are. These are both rules of C and C++ that were chosen for reasons.

But once you have those reasons, you are forced to conclude that members of structs have to have known size (and alignment), and incomplete structs don't. Meanwhile, pointers to incomplete structs do. So one is permitted, and the other not.

Yakk - Adam Nevraumont
  • 1
    @Deduplicator I mean, there are hardware platforms on which pointers to bytes have to be larger than pointers to integers. On such platforms, pointers to structs even if the struct is 1 byte are forced to be integer sized pointers, and structs with a char member have to waste space, or pointers to structs must all have the resolution of 1 byte; those are both "work arounds". – Yakk - Adam Nevraumont Jan 30 '18 at 20:40
  • "The short answer is, because the c and c++ standard both say so." That's not the short answer, the short answer is the top rated answer, "Because it is a pointer." You managed to be less concise, say less, and answer less, all while still being longer than the top answer despite being "the short answer". – Krupip Jan 30 '18 at 20:44
  • @snb My short answer was one sentence. It is quite short. I will add a line. Does that help? – Yakk - Adam Nevraumont Jan 30 '18 at 20:54
  • yeah, but quite a bit longer than "Because it is a pointer" and yours also only answers the question on technicality (yes, technically its because "the c and c++ standard say so" but since when is "because they said so" an adequate answer to *why*). I don't see what value this answer has in general, given the top two answers cover this much. Those answers manage to be much more readable, with less pointless exposition and run-on sentences. – Krupip Jan 30 '18 at 21:01
  • @snb "Because the compiler knows the size of the pointer" doesn't say *why*. You might as well say "because the type has a `*` in it". The reason why the compiler needs to know the size is because we want single-pass compilation and the ability to create storage for instances immediately after declaration and C/C++ actually stores values of struct/class type. Change *any* of these, and C/C++ would permit incomplete structs inside other structs. Stopping at "we know how big pointers are" is not much deeper than "the standard says so". – Yakk - Adam Nevraumont Jan 30 '18 at 21:06
  • '"Because the compiler knows the size of the pointer" doesn't say why.' Yeah it does. 'Stopping at "we know how big pointers are" is not much deeper than "the standard says so"' Its significantly deeper since it includes reasoning, IE, the why behind the behavior. – Krupip Jan 30 '18 at 21:15
  • @snb There, I made the chain of reasoning more explicit. – Yakk - Adam Nevraumont Jan 30 '18 at 21:18