17

Consider the following sample code:

#include <stdlib.h> // declares malloc

class C
{
public:
    int* x;
};

void f()
{
    C* c = static_cast<C*>(malloc(sizeof(C)));
    c->x = nullptr; // <-- here
}

If I had to live with the uninitialized memory for any reason (of course, if possible, I'd call new C() instead), I could still call the placement constructor (placement new). But if I omit this, as above, and initialize every member variable manually, does it result in undefined behaviour? I.e., is circumventing the constructor per se undefined behaviour, or is it legal to replace calling it with some equivalent code outside the class?

(Came across this via another question on a completely different matter; asking out of curiosity...)

curiousguy
Aconcagua
  • 5
    This particular code is fine, because `C` is a POD. As long as `C` is a POD, it can be initialized that way as well. – Nawaz Jun 05 '16 at 18:09
  • 5
    Technically it is not undefined behavior (as long as no constructor or destructor is invoked): see also std::is_trivial. –  Jun 05 '16 at 18:18
  • 1
    @Nawaz: Please provide _evidence_ to support your claim. – Lightness Races in Orbit Jun 05 '16 at 18:38
  • @LightnessRacesinOrbit: C++ claims to be compatible with C, so I assume here C++ doesn't break that. – Nawaz Jun 05 '16 at 18:39
  • Regardless of the answer, which I won't claim to know and even the high-rep guys can't agree on ;-) what is an example of an actually useful case of this pattern? – underscore_d Jun 05 '16 at 21:07
  • @underscore_d Perhaps implementing a replacement for std::vector without reallocation. Avoiding reallocation could be relevant in an embedded environment. There, you are often required to operate on statically allocated memory only, to prevent failure of the device during runtime due to exceeding total memory or memory being too fragmented already (malloc here actually was just a showcase to retrieve uninitialized memory). Admittedly, in those environments, one typically prefers C anyway... – Aconcagua Jun 06 '16 at 04:11
  • @underscore_d Ah, in a real implementation, I would call the placement constructor then in any case. Came across this via this [answer](http://stackoverflow.com/a/37378349/1312382), where it wasn't done. Of course, *there* the better/correct way would have been using `new[...]`. But I got curious... – Aconcagua Jun 06 '16 at 04:31
  • @underscore_d Implementing custom memory management, avoiding allocation and deallocation as much as possible (perhaps for a high performance application), would probably result in allocating large chunks of memory at once and using these - again via the placement constructor, though. – Aconcagua Jun 06 '16 at 04:34
  • 1
    @Nawaz: C++ claims no such thing, and this is one instance in which your assumption is ill-founded and has led to you spreading misinformation. – Lightness Races in Orbit Jun 06 '16 at 08:28
  • 2
    @LightnessRacesinOrbit: Explain further please. – Nawaz Jun 06 '16 at 08:45
  • 2
    @Nawaz: There are many ways in which C++ is not compatible with C. The necessity of casting the result of `malloc`, for example. Assuming that something is true in C++ because it is true in C is patently false. – Lightness Races in Orbit Jun 06 '16 at 09:11
  • 1
    @LightnessRacesinOrbit: I know there are many ways C++ is incompatible with C. That is not news to me at all. But I'm talking about this *specific* case, which, I think, is compatible with C (except of course the casting, which is not needed in C). – Nawaz Jun 06 '16 at 09:14
  • @Nawaz: That's not what you stated earlier. You stated that you _assumed_ this case was the same as C. And I quote: _"C++ claims to be compatible with C, so I assume here C++ doesn't break that"_. I am merely correcting this faulty logic. – Lightness Races in Orbit Jun 06 '16 at 09:16
  • @LightnessRacesinOrbit: How does it differ? Except that I clarified that the casting is necessary in C++, it being strongly typed compared to C. – Nawaz Jun 06 '16 at 09:21
  • @Aconcagua So, that's a no, then? ;-) – underscore_d Jun 06 '16 at 12:05
  • 1
    @underscore_d It certainly is UB if a non-trivial ctor is available, but seems to be legal for trivial (POD) objects according to the answers below. But although being legal for POD, I still have a bad feeling with not calling the (placement in this case) constructor. You are simply on the safe side if a non-trivial one is ever added... – Aconcagua Jun 06 '16 at 15:33
  • @Aconcagua Having read through the thread that **T.C.** edited into their answer, I now lean towards agreeing that 'by the book' this _is_ UB, even for trivial types. Not saying it _should_ be, and I'm sure discussions are ongoing about whether the wording should be changed, but that seems to be how it is. – underscore_d Jun 08 '16 at 13:49
  • @underscore_d This is what I am really not sure about. It seems to me that the citation provided by Niall is more precise in this matter (lifetime and vacuous initialisation), and the drafting note cited by T.C. is not to be found in the standard itself (I got the latest draft as linked [here](http://stackoverflow.com/a/4653479/1312382)). This is why I accepted Niall's answer instead of T.C.'s. It almost seems to me as if the standard is contradicting itself in this matter... The conclusion is easy, however: if you want to be on the safe side, call the constructor... – Aconcagua Jun 08 '16 at 16:00
  • 1
    @Aconcagua Well, that thread only 'joins the dots' between other parts of the Standard already cited. In cases where the Standard is unfortunately lacking in clarity, I have to go with what folk on the Committee say about what it _should've_ said... even when they're saying it should probably be changed to say something _else_. That's just everyday Standard acrobatics! My head is spinning. But to me, where UB is concerned - for the sake of safety - if there's smoke, there's fire. So, for me, Std fails to mention `malloc` as a valid object creator, & a Committee member says it's not => it's not – underscore_d Jun 08 '16 at 19:52
  • @underscore_d Reading answers and comments again and again, and the more I think about it, the more I tend to follow your last comment here. Switched accepted answer... – Aconcagua Jun 08 '16 at 20:22
  • @Aconcagua That makes me nervous about whether I'm interpreting it all correctly, but thanks :-) – underscore_d Jun 08 '16 at 20:27
  • 1
    @underscore_d You need not be - your comment was good, but actually, several comments of T.C. made me change my mind, so it's him who should be nervous then... – Aconcagua Jun 08 '16 at 20:34
  • Why do you use `reinterpret_cast`? – curiousguy Jun 10 '16 at 23:21
  • 1
    You're right, static_cast does the job as well, adjusted the question. Actually, it's not of relevance for the question, though. – Aconcagua Jun 11 '16 at 03:29
  • 2
    You have just pointed out a catastrophic error in the standard documents, going back a long time. The claim that a new expression is necessary to create an object is an obvious contradiction in the std and contradicts common sense. – curiousguy Jun 11 '16 at 05:24
  • 1
    @curiousguy Indeed! How bad an error that was is emphasised by how [its eventual fix](https://stackoverflow.com/a/61999151/2757035) is a DR retroactive all the way to C++98. Better late than never! – underscore_d Jun 24 '20 at 09:21

7 Answers

10

It is legal now, and retroactively since C++98!

Indeed, until C++20 the C++ specification listed the ways an object can be created as follows (e.g. C++17 wording, [intro.object]):

The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is created by a definition (6.1), by a new-expression (8.5.2.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).

The possibility of creating an object via malloc was not mentioned, which made using such storage as an object de facto undefined behavior.

This was later recognized as a problem and addressed by https://wg21.link/P0593R6, which was accepted as a defect report (DR) against all C++ versions back to and including C++98 and then incorporated into the C++20 spec with the following new wording:

[intro.object]

  1. The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is created by a definition, by a new-expression, by an operation that implicitly creates objects (see below)...

...

  1. Further, after implicitly creating objects within a specified region of storage, some operations are described as producing a pointer to a suitable created object. These operations select one of the implicitly-created objects whose address is the address of the start of the region of storage, and produce a pointer value that points to that object, if that value would result in the program having defined behavior. If no such pointer value would give the program defined behavior, the behavior of the program is undefined. If multiple such pointer values would give the program defined behavior, it is unspecified which such pointer value is produced.

The example given in the C++20 spec is:

#include <cstdlib>
struct X { int a, b; };
X *make_x() {
   // The call to std::malloc implicitly creates an object of type X
   // and its subobjects a and b, and returns a pointer to that X object
   // (or an object that is pointer-interconvertible ([basic.compound]) with it),
   // in order to give the subsequent class member access operations
   // defined behavior.
   X *p = (X*)std::malloc(sizeof(struct X));
   p->a = 1;   
   p->b = 2;
   return p;
}
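
For a class that is not an implicit-lifetime type, for example one with a user-provided constructor and a non-trivial destructor, malloc alone still does not start the lifetime of an object of that type. A minimal sketch of the placement-new route for that case follows; `Widget`, `make_widget` and `destroy_widget` are hypothetical names invented for this illustration, not anything from the standard or P0593R6.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical type: the user-provided constructor means it is not an
// aggregate, and the std::string member makes its destructor non-trivial,
// so it is not an implicit-lifetime class.
class Widget {
public:
    explicit Widget(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Obtain raw storage with malloc, then create the object explicitly.
Widget* make_widget(const std::string& name) {
    void* raw = std::malloc(sizeof(Widget));
    if (raw == nullptr) {
        throw std::bad_alloc();
    }
    try {
        return new (raw) Widget(name);  // placement new starts the lifetime
    } catch (...) {
        std::free(raw);                 // constructor threw: release storage
        throw;
    }
}

// End the lifetime explicitly before handing the storage back to free.
void destroy_widget(Widget* w) {
    assert(w != nullptr);
    w->~Widget();
    std::free(w);
}
```

A caller would pair the two functions, e.g. `Widget* w = make_widget("demo"); /* use w->name */ destroy_widget(w);`.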
Community
Amir Kirsh
  • 2
    Hurray, my code suddenly started behaving in a defined way after all these years (: – jwd Jun 23 '20 at 18:55
  • Wow, backported to C++98?!? Mind blown. How common is that? Mind you, something that made probably oodles of extant code UB, chiefly including anything which used `malloc()`... had to be treated as such, I guess. Between this and P1839R2, things are looking up - eh, @jwd ? ;-) I might not have to rewrite so much of that old code one day! – underscore_d Jun 24 '20 at 09:08
  • Seems fine for my example – if adding a custom constructor (with parameters or default), though, we wouldn't be able to implicitly create an object any more, I assume? So then still UB? – Aconcagua Jun 25 '20 at 06:52
  • @Aconcagua Yes, it is still UB because `X` would no longer be an [*implicit-lifetime type*](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html#affected-types) (the link explains the rationale). However, you can still interpret `p` as an array of `X` and do pointer arithmetic or placement new on it, but not accessing the members. – ph3rin Jan 01 '21 at 01:38
5

There is no living C object, so pretending that there is one results in undefined behavior.

P0137R1, adopted at the committee's Oulu meeting, makes this clear by defining object as follows ([intro.object]/1):

An object is created by a definition ([basic.def]), by a new-expression ([expr.new]), when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]).

reinterpret_cast<C*>(malloc(sizeof(C))) is none of these.

Also see this std-proposals thread, with a very similar example from Richard Smith (with a typo fixed):

struct TrivialThing { int a, b, c; };
TrivialThing *p = reinterpret_cast<TrivialThing*>(malloc(sizeof(TrivialThing))); 
p->a = 0; // UB, no object of type TrivialThing here

The [basic.life]/1 quote applies only when an object is created in the first place. Note that "trivial" or "vacuous" (after the terminology change done by CWG1751) initialization, as that term is used in [basic.life]/1, is a property of an object, not a type, so "there is an object because its initialization is vacuous/trivial" is backwards.
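
Under the reading given in this answer, the way to stay clearly within the rules is to create the object explicitly in the malloc'ed storage. The following is only a sketch of that idea: `TrivialThing` is taken from the example above, while `make_trivial_thing` and `set_all` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct TrivialThing { int a, b, c; };

// Allocate raw storage and explicitly create a TrivialThing in it.
// Returns nullptr if the allocation fails.
TrivialThing* make_trivial_thing() {
    void* raw = std::malloc(sizeof(TrivialThing));
    if (raw == nullptr) {
        return nullptr;
    }
    // Placement new creates the object; default-initialization leaves
    // the int members with indeterminate values until they are assigned.
    return new (raw) TrivialThing;
}

// Assign every member through a pointer to an existing object.
void set_all(TrivialThing* p, int v) {
    assert(p != nullptr);
    p->a = v;
    p->b = v;
    p->c = v;
}
```

Because `TrivialThing` has a trivial destructor, the storage can simply be released with `std::free(p)` once the object is no longer needed.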

T.C.
  • 3
    `C` is POD. If so, then I dont see *why* that should be UB. – Nawaz Jun 05 '16 at 18:12
  • 3
    At least that is what http://en.cppreference.com/w/cpp/concept/PODType says (you can create PODs via malloc). I currently can't find hard evidence in the standard but it allows conversion of `T*` into `U*` if alignment req. of `U` is not stricter than `T` and if both are standard-layout types. (Where there also isn't any actual `U` object around which is created via definition, new-expression or by the implementation.) – Pixelchemist Jun 05 '16 at 18:27
  • 1
    @Pixelchemist Then cppreference needs fixing. – T.C. Jun 05 '16 at 18:34
  • 1
    @Nawaz: Just because type `C` is POD doesn't mean you can treat any old uninitialised block of memory as a `C`. The program still needs to know that you want an object of type `C` there. C++ is an _abstraction_ over bytes in memory. – Lightness Races in Orbit Jun 05 '16 at 18:35
  • I think that comes from the language C... if not, then is this an example where C++ is incompatible with C? – Nawaz Jun 05 '16 at 18:37
  • 7
    So does that apply to `int` too? If not, why not? If yes, can we `malloc` anything at all? – Baum mit Augen Jun 05 '16 at 18:57
  • 1
    @BaummitAugen Yes. The status quo is that you can `malloc` anything you want, but what you get is just raw storage. To create an object, you need placement new. – T.C. Jun 05 '16 at 18:59
  • 2
    @T.C.: That makes it incompatible with C? – Nawaz Jun 05 '16 at 19:03
  • BaumMitAugen's answer was correct, and even provided the Standard quote. Trivial initialization *is* a property of the type. See the type trait `std::is_trivially_constructible` – Ben Voigt Jun 05 '16 at 20:01
  • 1
    @BenVoigt Not in the sense used in [basic.life]/1, which is why CWG1751 changed the terminology to "vacuous initialization". In `C c = /*...*/; C c1 = c;`, `c1` has non-vacuous (or non-trivial) initialization even if `C` is trivial - this even is called out in the note in [basic.life]/1. – T.C. Jun 05 '16 at 20:05
  • 3
    @T.C.:Your whole answer relies on one excerpt from `[intro.object]` taken out of context and assumed to be an exhaustive list when it is not. Beginning of same paragraph: **An object is a region of storage.**. `malloc` most certainly allocates an object. – Ben Voigt Jun 05 '16 at 20:21
  • 5
    Consider also this footnote "39) This section does not impose restrictions on indirection through pointers to memory not allocated by `::operator new`. This maintains the ability of many C++ implementations to use binary libraries and components written in other languages. In particular, this applies to C binaries, because indirection through pointers to memory allocated by std::malloc is not restricted." – Ben Voigt Jun 05 '16 at 20:24
  • 1
    @BenVoigt "An object is a region of storage" doesn't imply "every region of storage is an object". And I fail to see the relevance of a non-normative footnote in a section not about lifetime, but about safely-derived pointer values. – T.C. Jun 05 '16 at 20:38
  • 3
    @T.C.: Likewise "A, B, and C create objects" does not mean that "A, B, and C are the only ways to create objects". – Ben Voigt Jun 05 '16 at 20:50
  • 2
    @BenVoigt Sure, but what's the other way you are talking about? The standard says A, B, and C can create an object; it doesn't say anything else creates an object. – T.C. Jun 05 '16 at 20:56
  • @T.C. The thing that troubles me with this logic is that the initialisation is specified to have no effect ([dcl.init]p7.3). If the initialisation has no effect, how can there be a difference between the initialisation being performed, and the initialisation being bypassed? –  Jun 06 '16 at 19:37
  • 4
    @hvd A placement new (since that's what's being bypassed here) has two effects: 1. it creates an object; 2. it runs the initialization. Just because #2 has no effect doesn't mean you can bypass #1. – T.C. Jun 06 '16 at 19:40
  • @T.C. Ah, thanks, that's the logic I wasn't seeing, that makes sense. –  Jun 06 '16 at 19:45
  • +1 for adding the link to the proposals thread. That seems strongly to support the idea that this has been and still is UB. Now, of course, whether it _really_ was always meant to be, or should be, can be questioned - but as they say in the thread, that's a separate discussion. – underscore_d Jun 08 '16 at 13:38
  • The document you are quoting is a crazy text. – curiousguy Jun 09 '16 at 00:45
  • @BenVoigt "_does not mean that "A, B, and C are the only ways to create objects_" I am pretty sure this is the intended meaning. It's craziness. Deal with it – curiousguy Jun 09 '16 at 00:47
  • ITT people voting according to what they think C++ should be like, not what the standard actually says – M.M Jun 11 '16 at 03:55
  • @M.M Sometimes the std says crazy stuff and should be and is ignored by intelligent people. Should the std say that malloc and a cast doesn't create an aggregate object, this would be one of these cases. – curiousguy Jun 11 '16 at 04:10
  • So even `malloc(sizeof(int))` doesn't work and cannot be made to work, now? Given the fact that [intro] is boring and nobody cares much about [intro] and the real stuff is discussed in [basic.life], the conclusion is obvious: [basic.life] wins. – curiousguy Jun 11 '16 at 05:39
  • @curiousguy Just a (late) thought: Wouldn't `int* n = (int*) malloc(sizeof(int)); new(n) int(7);` do the trick legally? – Aconcagua Apr 28 '17 at 08:08
  • 1
    @Aconcagua If you admit you need placement new here, you have to recognize that there isn't any valid dynamic program in the C/C++ common subset. The original idea of C++ as close to C as possible is dead! – curiousguy Apr 28 '17 at 23:15
  • By same logic is `int* i = (int*)malloc(sizeof(int)); *i = 1;` UB? Because "malloc" is used, it can be argued that the region of storage allocated doesn't correspond to an "object" and since we don't have an object, this code should also be UB? – Cheshar May 17 '20 at 14:25
3

I think the code is OK, as long as the type has a trivial constructor, as yours does. Using the object cast from malloc without calling placement new is just using the object before calling its constructor. From the C++ standard, 12.7 [class.cdtor]:

For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior.

Since the exception proves the rule, referring to a non-static member of an object with a trivial constructor before the constructor begins execution is not UB.

Further down in the same paragraph there is this example:

extern X xobj;
int* p = &xobj.i;
X xobj;

This code is labelled as UB when X is non-trivial, but as not UB when X is trivial.
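
For reference, a self-contained version of the trivial case might look like the sketch below. The definition of `X` is an assumption (the excerpt above does not show it), and `read_through_p` is a hypothetical helper added only so that the pointer gets used.

```cpp
#include <cassert>

struct X { int i; };  // assumed definition: a trivial class

extern X xobj;        // declaration only, the definition comes later
int* p = &xobj.i;     // refers to a member before the definition of xobj
X xobj;               // definition; static storage is zero-initialized

// Reads the member through the pointer taken above.
int read_through_p() {
    assert(p == &xobj.i);
    return *p;
}
```

Note that this only takes the address of the member ahead of the definition. It says nothing by itself about storage obtained from malloc, where no definition of an `X` object exists at all.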

rodrigo
  • 1
    Thanks, that is a good hint - in case of a non-trivial X, it is even stricter than my code. However, in case of a trivial one, it seems weaker to me, as only the address of the member is taken, but no value is assigned. – Aconcagua Jun 06 '16 at 05:06
  • 1
    First, this is different because there's unambiguously an object created by the definition `X xobj;`, whereas the `malloc` case is at a minimum substantially more questionable. Second, while *referring* "to a non-static member of an object with a trivial constructor before the constructor begins execution" does not have UB, in cases where such an object has non-vacuous initialization (e.g., initialization by a trivial copy constructor) [basic.life]/6-7 severely constrains what you can do with it before its lifetime starts: any attempt to *access* such a member results in UB. – T.C. Jun 06 '16 at 16:02
  • @T.C.: I'm not sure I understand you. First, the quote from [class.cdtor] talks about an _object with a trivial constructor_, not an _object trivially constructed_. Second, an object allocated with `malloc` will never use a copy constructor: as per [basic.life]/1, if it has trivial initialization its lifetime will start as soon as the storage is obtained, that is when `malloc` returns. – rodrigo Jun 06 '16 at 18:00
  • 1
    1) You are still operating on the assumption that `malloc` creates an object, which a) is at least debatable, b) is not the reading of the core working group (as far as I can tell, based on the publicly available discussions on P0137), and c) will likely be unambiguously clarified the other way soon-ish by a revision of that paper. 2) [class.cdtor]/1 does not disallow "referring" to a non-static data member if the object has no non-trivial constructor. But "accessing" (that is, reading or writing) said data member is still UB per [basic.life]/6-7 if its initialization is non-vacuous. – T.C. Jun 06 '16 at 18:12
  • 1
    @T.C.: 1a) `malloc` allocates storage of appropriate size and alignment, so if the type is trivially constructible, it creates an object, AIUI. 1b) Where can I read P0137? I cannot find it anywhere. 2) You say _if its initialization is non-vacuous_, but I stand that `malloc` of a trivially constructible type is a vacuously initialized object, so any reading or writing of its members is well defined. – rodrigo Jun 06 '16 at 18:26
  • 1
    1b) http://wg21.link/P0137; Richard Smith's current WIP draft can be found [here](http://rawgit.com/zygoloid/wg21papers/master/wip/d0137r1.html). That paper went through multiple CWG reviews, starting from when it was [N4430](http://wg21.link/N4430), and has consistently maintained that the status quo is "`malloc` alone is not sufficient to create an object", so I think it's fair to assume that CWG is in agreement with that statement. – T.C. Jun 06 '16 at 19:32
  • 2) I wasn't talking about `malloc`. The point I was trying to make is that while having no non-trivial constructors allows one to avoid the UB in [class.cdtor]/1, [basic.life] may still impose additional constraints, depending on how the object is initialized. – T.C. Jun 06 '16 at 19:36
  • @T.C. Thanks for the links, unfortunately the references to `malloc` are quite marginal there. 2) The funny thing about [basic.life] is that if the object has a vacuous initialization, then the object lifetime begins before such initialization takes place! What if you vacuously initialize the object twice? The standard says nothing about that, but my understanding is that it is ok, as long as all of them are vacuous, or else the UBness of this code may depend on how the object is initialized in the future! – rodrigo Jun 06 '16 at 20:00
  • Considering that the very first change redefines *object* (that's what the italics means) in a way that clearly excludes `malloc`, I don't see how that's marginal. – T.C. Jun 06 '16 at 20:03
  • "the exception proves the rule" is not a principle followed by the Standard. See [Denying the antecedent](https://en.wikipedia.org/wiki/Denying_the_antecedent) – M.M Jun 11 '16 at 04:16
  • @curiousguy If you can do better, there is a seat waiting for you :) – M.M Jun 11 '16 at 04:18
1

Circumventing the constructor generally results in undefined behavior.

There are, arguably, some corner cases for plain old data types, but you gain nothing by bypassing the constructor there anyway, since the constructor is trivial. Is the code as simple as presented?

[basic.life]/1

The lifetime of an object or reference is a runtime property of the object or reference. An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its subobjects is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-vacuous initialization. — end note ] The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and
  • if the object has non-vacuous initialization, its initialization is complete.

The lifetime of an object of type T ends when:

  • if T is a class type with a non-trivial destructor ([class.dtor]), the destructor call starts, or
  • the storage which the object occupies is reused or released.

Aside from the code being harder to read and reason about, you will either gain nothing or end up with undefined behavior. Just use the constructor; it is idiomatic C++.
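
Whether a type's default constructor is trivial can be checked at compile time. The sketch below uses the standard `std::is_trivially_default_constructible` trait as a rough approximation only, since vacuous initialization in the sense of [basic.life]/1 describes a particular initialization rather than a type. `C` mirrors the class from the question, and `WithCtor` and `default_init_is_trivial` are hypothetical names.

```cpp
#include <complex>
#include <type_traits>

struct C { int* x; };                             // as in the question
struct WithCtor { int v; WithCtor() : v(0) {} };  // hypothetical, non-trivial

// True if default-initializing a T runs only trivial operations.
template <class T>
constexpr bool default_init_is_trivial() {
    return std::is_trivially_default_constructible<T>::value;
}

static_assert(default_init_is_trivial<C>(), "C has a trivial default ctor");
static_assert(!default_init_is_trivial<WithCtor>(), "user-provided ctor");
static_assert(!default_init_is_trivial<std::complex<double>>(),
              "std::complex's default ctor is user-provided");
```

A `static_assert` of this kind next to hand-written initialization code would at least catch the case where someone later adds a non-trivial constructor to the class.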

Community
Niall
  • 3
    I find it often very useful to avoid the default ctor for `std::complex`. The standard requires this to be implemented as two doubles and there is no way to avoid them being set to zero if you default-construct the object. However, that is a very much unnecessary write operation if you overwrite the associated memory in the next step anyway; e.g. due to it being the result of a matrix-matrix multiplication or somesuch. – Claudius Jun 05 '16 at 18:27
  • 2
    @Claudius: In that case you can use placement new instead of first doing a default initialization and then assigning new values. – Pixelchemist Jun 05 '16 at 18:30
  • 3
    @Pixelchemist All these placement-new calls will still eat up runtime, especially since they’d be in the hot inner loop (and in my particular case I’d have to change the BLAS implementation to achieve this, so it’s out of question). – Claudius Jun 05 '16 at 18:34
  • @Claudius Yeah, that's a good point. Those placement news *should* be optimised away if immediately followed by an assignment, and a quick check suggests that at least GCC is able to do that, but it may not be that easy with all compilers. –  Jun 06 '16 at 19:34
  • @hvd That is a big if. There may well be a big gap between the allocation and the first write and mixing the two will make things often more complicated. The issue is essentially that `std::complex<>` default ctor is not trivial and quite frankly, I don’t see why it couldn’t just leave the values uninitialised? – Claudius Jun 07 '16 at 19:06
  • @Claudius The gap between the allocation and the first write doesn't matter. You can allocate memory but leave it uninitialised. You can then delay the construction until you're ready to write a value. And none of this is a defense of `complex`'s default ctor's behaviour, I'm just trying to help work around it. –  Jun 07 '16 at 19:52
-1

This particular code is fine, because C is a POD. As long as C is a POD, it can be initialized that way as well.

Your code is equivalent to this:

struct C
{
   int *x;
};

C* c = (C*)malloc(sizeof(C)); 
c->x = NULL;

Does it not look familiar? It is all good. There is no problem with this code.

Nawaz
  • 1
    This answer has -4 with no justification in comments... :( should not even be allowed by SO! – curiousguy Jun 10 '16 at 19:39
  • 1
    @curiousguy maybe nobody wants to get into a lame argument. Many posters passionately defend their incorrect answers and never admit to being wrong , so a comment is often an invitation to wasting time followed by personal abuse. – M.M Jun 11 '16 at 03:51
  • 1
    @M.M Maybe nobody has evidence. – curiousguy Jun 11 '16 at 04:07
  • @curiousguy As this answer stands, all we can say is "There is no evidence for your claims", onus is on the answerer to provide evidence. On a language-lawyer question, answers should be backed up with quotes from standards documents. – M.M Jun 11 '16 at 04:13
  • @M.M And yet you provided no such quote. – curiousguy Jun 11 '16 at 04:14
  • @curiousguy I haven't posted an answer. (TC's answer is correct) – M.M Jun 11 '16 at 04:45
  • 1
    @M.M And he provided no quote from a std. – curiousguy Jun 11 '16 at 04:52
  • @curiousguy 3 quotes from standards-related documents are provided along with a coherent argument – M.M Jun 11 '16 at 04:58
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/114398/discussion-between-curiousguy-and-m-m). – curiousguy Jun 11 '16 at 05:06
  • @M.M: If the specific tag "language-lawyer" requires an answer to be backed up with quotes from the spec, then I'd say this tag was *added* much later after this answer was posted (see the history). As for quotes, there are thousands of answers without quotes. Also, the quote didn't seem necessary when the code (or its translation like I did) *looks* so familiar. Also, when the accepted answer says the same thing (plus quotes), I dont see why this got so many downvotes. – Nawaz Jun 11 '16 at 06:27
  • @Nawaz the accepted answer says it's UB, yours says it's OK – M.M Jun 11 '16 at 06:39
  • 1
    @M.M: Oops. The OP initially had accepted @rodrigo's answer, with which I agree. T.C's answer (which is currently the accepted answer) doesn't make sense to me, because if that is true, then `char * buff = malloc(100); strcpy(buff, "hello");` would also be UB, which is nonsense. – Nawaz Jun 11 '16 at 10:19
  • @Nawaz that is ill-formed, `void *` may not be converted to `char *` without a cast. I guess you mean `char * buff = (char *)malloc(100);` etc., and yes, that is UB. As you may have seen from TC's post, efforts are underway to improve the standard so that this is well-defined, but the fact remains that it currently is not. If you try to provide standard references to support your position you will find that there are no such references. – M.M Jun 11 '16 at 13:37
  • The `malloc` and `strcpy` functions do not begin lifetime of an object, and writing to space where no object exists is UB. – M.M Jun 11 '16 at 13:44
  • 1
    @M.M: What is *the definition* of an object? – Nawaz Jun 11 '16 at 14:23
  • @Nawaz various properties of objects are listed in [intro.object] , that's the closest thing the standard has to a definition. – M.M Jun 11 '16 at 14:28
  • 1
    @M.M: And which section says the arguments to the `std::strcpy` must be an *object*? which section says *"writing to space where no object exists is UB"*? I need evidence. – Nawaz Jun 12 '16 at 03:58
  • @Nawaz see C++14 [basic.life]/6 which describes what you're allowed to do with storage which has been obtained but has not had the lifetime of any objects begun in that storage yet. Although you could argue that if there is never going to be an object then that section doesn't apply. In which case look at [expr.ass]/2 which says that assignment is only defined when the left hand side refers to an object; and `strcpy` is defined as being equivalent to a series of char assignments. – M.M Jun 12 '16 at 06:21
  • 1
    @M.M Don't you see that the conclusion is patently absurd? – curiousguy Jun 12 '16 at 17:01
  • 1
    @M.M: Such code is valid in C. So does it mean C++ is incompatible with C in *this* case? If so, since which version of C++? – Nawaz Jun 12 '16 at 17:29
  • 1
    Without an explicit decision by core to break such a basic construct, we can conclude it's an oversight in a section historically known to be badly written and full of issues. – curiousguy Jun 12 '16 at 18:01
  • 1
    @M.M: I'm also curious, in what way `int *p = (int*)malloc(sizeof(int));` differ from `int *p = new int;`? The malloc version is valid in C, but invalid in C++, is that your conclusion? – Nawaz Jun 12 '16 at 18:33
  • @curiousguy Yes, it is absurd and efforts are underway to improve the standard , as described in TC's post. Whether it's an oversight is moot. – M.M Jun 12 '16 at 21:20
  • @Nawaz that `new` expression obtains storage and creates an object. The `malloc` function, and the `operator new` function, obtain storage and do not create an object. See [expr.new] for more detail. All versions of ISO C++ have been the same in this respect (malloc does not create an object). – M.M Jun 12 '16 at 21:21
  • You guys seem puzzled about the rationale for C++'s rules. I guess you feel that malloc should create an object. So answer this: what is the type of the object created by `malloc(4)` ? Bear in mind that in C++ all objects have a type `T` fixed at their point of creation. There is no concept in the standard of un-typed objects like C has. – M.M Jun 12 '16 at 21:34
  • 1
    @M.M: Please answer to my *last* two comments in which I asked whether `int *p = (int*)malloc(sizeof(int));` differs from `int *p = new int;` in *any* way... If yes, then does it mean C and C++ are incompatible in *this* case? – Nawaz Jun 13 '16 at 03:47
  • @Nawaz In C, `new int` is a syntax error so they differ. In C++, the first does not create an object and the second does create an object, so they differ. C and C++ are different languages. IDK what you mean exactly by "incompatible". – M.M Jun 13 '16 at 03:55
  • 1
    @M.M: I'm not talking about the validity of `new int` in C. Of course, it is not valid. I'm talking about the malloc part. You can use it in both C and C++... and by "incompatible" I mean: does the expression `(int*)malloc(sizeof(int))` have different (and incompatible) semantics in C and C++? – Nawaz Jun 13 '16 at 04:49
  • @Nawaz C and C++ have different object models. (For example C has a concept of untyped objects, C++ doesn't). So I would not try to directly compare the meaning of code in the two different languages. I already clearly stated exactly what the malloc line does in C++; you can draw your own comparison if you want. – M.M Jun 13 '16 at 05:11
  • @M.M "_I guess you feel that malloc should create an object. So answer this: what is the type of the object created by malloc(4)_" every object type that fits, of course! So in practice `char`, `short`, `unsigned short`, `int`, `unsigned int`, on 32 bits architectures all kinds of pointers to objects and to functions... – curiousguy Jun 15 '16 at 21:04
  • @curiousguy so you're saying one `malloc` creates dozens of overlapping objects (in fact, infinite number since we could make various structs). I'd call that absurd. – M.M Jun 15 '16 at 21:17
  • @M.M Call that as you want, but it's what happens in C++. Deal with it! – curiousguy Jun 15 '16 at 21:18
  • @curiousguy According to you, but not according to the C++ Standard ... seems based on your comments on this thread you have trouble accepting the standardization process – M.M Jun 15 '16 at 21:19
  • @M.M "_C and C++ have different object models_" Maybe you should do some reading about the origin and design of C++. Compatibility with C was always a primary goal! If you don't know that, you don't understand anything about C++! – curiousguy Jun 15 '16 at 21:20
  • @M.M "_seems based on your comments on this thread you have trouble accepting the standardization process_" I know the std process, unlike you. I also know the std, unlike you. I don't have trouble. – curiousguy Jun 15 '16 at 21:22
  • 1
    @M.M: Let's keep this simple: if my C++ code calls a `C` API, which allocates memory (using `malloc` of course) for type `char*` and returns it, then can the caller, which is C++, *use* this buffer in `strcpy`? – Nawaz Jun 16 '16 at 04:53
  • @Nawaz That's not covered by ISO standards AFAIK, it would be up to the particular compiler. Many aspects of mixing C code with C++ code are not covered by ISO standards either; e.g. standards do not guarantee that the same type has the same size in each language. The compiler vendor will decide something on their own (perhaps in collaboration with other compiler vendors) so that their C and C++ compilers work together. – M.M Jun 16 '16 at 05:19
  • @curiousguy The fact is that the C and C++ standards, as they stand today, specify different object models. It's in plain text in each standard document. It's not constructive for you to keep denying this, I won't respond further to your comments. – M.M Jun 16 '16 at 05:24
  • @M.M: Let's not talk about the compilers, as for compilers even what I've written in the answer *does work*. Let's limit our discussion to the standard only, according to which, as per *your* understanding, this answer and the other examples in the comments invoke UB? So I'm wondering what *part(s)* of C++ are compatible with C. Can you please give me *one* good example? – Nawaz Jun 16 '16 at 05:38
  • @Nawaz The C standard doesn't mention C++, and the C++ standard doesn't explicitly cover "Compatibility with C". There is `extern "C"` linkage specification, and deferral to the C standard for some library functions. I don't know why you are talking about "C and C++ compatibility" (whatever that is supposed to mean) because this thread is solely a C++ thread and the examples on this thread are solely C++ examples. Any issues with C interoperability are quite separate to the examples on this thread. – M.M Jun 16 '16 at 11:30
  • As I've stated many times, in C++ and according to the C++ standard, your code causes UB, and TC's answer is correct. If you doubt this then *use references in the standard to support your argument*. All I've seen in comments from you and curiousguy is hand-waving about what you imagine or wish C++ should be like, and standards be damned. – M.M Jun 16 '16 at 11:32
  • 1
    @M.M: I'm bringing C here for two reasons: 1) C++ is considered to be backward compatible with C (to some extent at least?). 2) The examples used here are the boundaries, on one side is C and on the other side is C++. I'm trying to understand what is valid code in C++. So again, my question is: is `std::strcpy(buffer, "Hello World");` valid in C++ if `buffer` is allocated by a C API (which does this `char * buffer = malloc(n);` where `n > std::strlen("Hello World");` )? It is a very simple question, and I'm expecting an answer from you in the form of "yes" or "no". – Nawaz Jun 16 '16 at 12:08
  • 1
    @M.M You can't consider the words of this extremely poor specification called the standard without considering intent. The intent was always to be compatible. That the C std isn't quoted in core is 1) patently obvious 2) of no relevance here. You obviously have no understanding of programming languages semantics. – curiousguy Jun 16 '16 at 12:17
  • @M.M "_e.g. standards do not guarantee that the same type has the same size in each language_" again you show your poor grasp of C semantics, C++ semantics, and standardisation. Given the fact that representation of fundamental types is implementation dependent, there is no guarantee that two implementations of any of these languages are binary compatible. **It makes no sense to ask if any implementation of C++ must be compatible with some random C implementation.** – curiousguy Jun 16 '16 at 12:20
  • @Nawaz IMO that question cannot be correctly answered by either "yes" or "no". To get a better answer, post a new language-lawyer question (perhaps linking to this thread as reference). – M.M Jun 16 '16 at 12:23
  • @M.M "_The fact is that the C and C++ standards, as they stand today, specify different object models._" Which one? The latest C standard is even more badly broken than the C++ standard, with that absurd specification of "effective type", a failed attempt at making C "high level". "_It's in plain text in each standard document. It's not constructive for you to keep denying this_" You are just making up stuff. I never wrote anything like that. **C/C++ is a thing.** Even if you deny that. – curiousguy Jun 16 '16 at 12:25
  • 1
    @M.M: What is *your* answer to my question as per *your* understanding? As for better answer, I might start a new thread, but that is a different matter altogether. – Nawaz Jun 16 '16 at 12:45
  • @Nawaz my understanding is that it's not covered by ISO standards, therefore the answer varies per compiler – M.M Jun 16 '16 at 21:01
  • @M.M: In other words, it is UB, pedantically speaking? or unspecified behaviour? What do you call it? – Nawaz Jun 17 '16 at 03:43
  • @Nawaz those are standard terms and this is not covered by the standard, so I wouldn't use any of those. I would use the terms I already used in comments. – M.M Jun 17 '16 at 11:49
  • 1
    @M.M: If something is *not* covered by the Standard, that is UB. Isn't it? – Nawaz Jun 17 '16 at 18:45
  • 1
    @M.M Behavior of program which is **not defined** by the std is **undefined** behavior. Since you don't seem to even grasp the fundamental concept of UB, you are absolutely not capable of discussing the meaning of the std. – curiousguy Jun 18 '16 at 00:30
  • @Nawaz you could say that, although in practice you would treat it differently to most UB situations – M.M Jun 18 '16 at 05:41
  • 1
    @M.M: So you're trying to say there is more than one *kind* of UB? And a few of them should be treated *"differently"* and the rest should be downvoted right away? :D – Nawaz Jun 18 '16 at 06:14
  • @Nawaz if you're referring to downvotes on your answer - that's nothing to do with where this argument has gotten to; some other situation in a different language does not impinge on this C++ question – M.M Jun 18 '16 at 13:45
  • 1
    @M.M: Now it is clear from your argument that there are *two* kinds of UB, one kind is allowed though, as per your understanding. Now I have another question: is `malloc` a C API? – Nawaz Jun 18 '16 at 16:41
  • @Nawaz sorry but I'm not interested in this sort of pointless argument in comments. Use the Question box for questions. – M.M Jun 19 '16 at 04:46
  • 1
    @M.M What makes you believe explicitly undefined behavior and behavior which isn't defined explicitly are treated differently in practice? – curiousguy Jun 19 '16 at 21:25
-1

While you can initialize all explicit members that way, you cannot initialize everything a class may contain:

  1. reference members can only be bound during construction (in a constructor's initializer list), never assigned afterwards

  2. vtable pointers cannot be manipulated by code at all

That is, the moment that you have a single virtual member, or virtual base class, or reference member, there is no way to correctly initialize your object except by calling its constructor.
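As a rough illustration of the constructor route on `malloc`'d storage, here is a minimal sketch using placement new. The names `Widget` and `read_through_malloc` are made up for this example and do not come from the question; the sketch assumes the storage returned by `malloc` is suitably aligned for the type, which holds for ordinary types like this one.

```cpp
#include <cstdlib>
#include <new>

// Hypothetical type for illustration: it has a reference member and a
// virtual member, the two cases listed above.
struct Widget {
    int& ref;                                  // bound during construction; cannot be reseated later
    explicit Widget(int& r) : ref(r) {}
    virtual int value() const { return ref; }  // virtual dispatch state is set up by the constructor
    virtual ~Widget() = default;
};

// Hypothetical helper: obtains raw storage, constructs a Widget in it,
// uses it, then destroys it and releases the storage.
int read_through_malloc() {
    int target = 42;
    void* raw = std::malloc(sizeof(Widget));
    if (raw == nullptr) {
        return -1;                             // allocation failed
    }
    Widget* w = new (raw) Widget(target);      // placement new runs the constructor in the raw storage
    int result = w->value();                   // virtual call works because the object was constructed
    w->~Widget();                              // explicit destructor call ends the object's lifetime
    std::free(raw);                            // storage goes back to the allocator that produced it
    return result;
}
```

The explicit destructor call followed by `std::free` mirrors the two steps that `delete` would otherwise perform together.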

cmaster - reinstate monica
  • Virtual members, **virtual base classes**, reference members and **const members** will definitely not work w/o ctor. – curiousguy Jun 11 '16 at 01:01
  • "_Why the downvote?_" I don't know, but maybe because you don't really answer the question. – curiousguy Jun 11 '16 at 04:20
  • @curiousguy I have added the virtual base to my list now, thanks. However, I disagree about the `const` members: You are perfectly allowed to cast away constness to initialize them anyway, even though you really shouldn't do it. – cmaster - reinstate monica Jun 11 '16 at 19:51
  • @curiousguy Concerning answering the question: Yes, I did not directly answer it. I just gave counterexamples that make it clear why it is a really bad idea to try to circumvent constructor calls, and why there is no way for the language to avoid classifying these counterexamples as undefined behavior. Maybe I was not clear enough about my intentions? – cmaster - reinstate monica Jun 11 '16 at 19:56
  • 1
    "_You are perfectly allowed to cast away constness_" says who? – curiousguy Jun 16 '16 at 23:56
  • @curiousguy Most uses of `const` only ever had the semantics of logical constness: While it was not considered ok to cast away constness in most places, there is nothing in the machine to stop you from it. The only exception is static const data which is already encoded in the binary. For instance, logical constness means that an object is perfectly allowed to manage a cache, as long as that cache does not change its observable behavior. And to implement that cache, it needed to cast away constness before the introduction of the `mutable` keyword. – cmaster - reinstate monica Jun 17 '16 at 05:59
  • "_And to implement that cache, it needed to cast away constness before the introduction of the mutable keyword_" not in every case: you could also get away with just an indirection, no `const_cast` and no `mutable`! – curiousguy Jun 17 '16 at 22:54
  • @curiousguy There is no programming problem that can't be solved with another level of indirection ;-) But here is another example: You have a reference counted object and you need to be able to increment/decrement the ref-count when you only have a const pointer to it. Obviously, changing the ref-count won't affect logical constness, but you need to cast away constness from / use `mutable` on the ref-count to update it. Even C++11 would only require the ref-count to be atomic to make it internally synchronized. With that, it's perfectly legal for the ref-count to be `mutable`. – cmaster - reinstate monica Jun 18 '16 at 08:38
  • There is an inherent race condition with the observation of the ref count of shared object in MT program. But ref count has a special property that in a MT program, only the observation of value 1 is interesting, as it means that no other owner exists, hence owning references can only be created from this owner, hence no other thread can do that without cooperation (or a data race). Values greater than 1 means that other owners exist and in general we don't know which thread own them. Values greater than 1 could change async, but value 1 will not. – curiousguy Jun 18 '16 at 17:01
  • @cmaster: Your statement ***"You are perfectly allowed to cast away constness to initialize them anyway"*** followed by ***"even though you really shouldn't do it"*** does not make sense. One is negated by other. Also, your argument seems to be driven by belief, rather than rational argument. – Nawaz Jun 20 '16 at 09:18
  • @Nawaz The first statement is about what the standard allows, which is due to the intended semantics of `const`. The second statement is about good programming practice. **These two are not the same, and are not meant to be the same.** As such, there is no contradiction in my statements. It is true that I'm not enough of a language lawyer to cite the precise paragraph and sentence of the standard that allows casting away constness. However, the mere existence of the `const_cast<>` syntax clearly shows that there are situations in which it is perfectly legal to use it. – cmaster - reinstate monica Jun 20 '16 at 15:00
  • @cmaster "_clearly shows that there are situations in which it is perfectly legal to use it._" It shows no such thing. – curiousguy Jul 25 '16 at 17:14
-2

I think it shouldn't be UB. You make your pointer point to some raw memory and treat its data in a particular way; there's nothing bad here.

If the constructor of this class does something (initializes variables, etc.), you again end up with a pointer to a raw, uninitialized object. Using it without knowing what the (default) constructor was supposed to do (and repeating that behavior) will be UB.
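For comparison, here is a hedged sketch of the question's class `C` with the one extra step that the comments on this page suggest removes any doubt: a placement new before the member assignment. The `static_assert` and the helper name `make_and_check` are additions for illustration only; this is one possible way to write it, not a statement of what the standard requires.

```cpp
#include <cstdlib>
#include <new>
#include <type_traits>

// The class from the question: no user-provided constructor.
class C {
public:
    int* x;
};

// Checks at compile time that default construction of C runs no code,
// so "repeating the constructor's behavior" means only setting members.
static_assert(std::is_trivially_default_constructible<C>::value,
              "C is expected to have a trivial default constructor");

// Hypothetical helper for illustration.
bool make_and_check() {
    void* raw = std::malloc(sizeof(C));
    if (raw == nullptr) {
        return false;                 // allocation failed
    }
    C* c = new (raw) C;               // starts the object's lifetime; x is left uninitialized
    c->x = nullptr;                   // manual initialization of the member
    bool ok = (c->x == nullptr);
    c->~C();                          // trivial destructor, effectively a no-op
    std::free(raw);
    return ok;
}
```

For a type with a trivial default constructor the placement new typically generates no code, so the cost of writing it is only the extra line.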

ForceBru