
Is it legal to assign to memory in which no object has been created? I ask because [expr.ass]/2 says:

In simple assignment (=), the object referred to by the left operand is modified ([defns.access]) by replacing its value with the result of the right operand.

Combined with [defns.access], this seems to imply that the left operand must refer to an actual object. An example that could trigger this is allocating raw memory without creating any object in it and then assigning to it:

int * foo = static_cast<int *>(::operator new(sizeof(int)));
*foo = 5;

This seems natural, but based on those definitions I feel it may be illegal without first creating the object in the allocated memory, for example with placement new. Additionally, I am not sure whether that cast is valid before any int object has been created there, so that might be another issue with the example.

Similarly, would performing this action through an "imaginary" object representation like so be valid:

void * foo = ::operator new(sizeof(int));
int bar = 5;
std::memcpy(foo, &bar, sizeof(int));

This seems like it could be valid, but [basic.types]/3 says:

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a potentially-overlapping subobject, if the underlying bytes ([intro.memory]) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1.

This again seems to imply that the destination must be an actual existing object, so this wouldn't work either.

I suppose the final confusion, if neither of those examples is valid, is how something like std::memset works. Presumably it does the same thing: it takes empty memory with nothing in it and interprets it through some arbitrary object representation to assign to. In fact, the C11 standard says the following about it:

The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.

This suggests that an object must already exist at the destination (s). In C, as I understand it, an object is created when something like a void * pointing at empty memory is cast to a pointer of another type and accessed. So wouldn't memset in C create a bunch of char objects, which would later be incorrectly aliased as another type if the pointer were cast to, say, float *? I think that might just be a flaw in how the C standard handles object creation. I also assume the answer in C++ is "just placement new everything first", but it is hard to tell given how complex the standard gets when dealing with fundamentals like this.

Lemon Drop
  • `int * foo = static_cast<int *>(::operator new(sizeof(int))); *foo = 5;` would be fine (but doing `some_int = *foo` would have undefined behaviour, since `*foo` is not initialised). It would also be fine if a trivial type is used instead of `int`. Use a non-trivial class type instead of `int` (e.g. one that has a constructor which sets up class invariants) and the behaviour would be undefined. – Peter Jan 03 '20 at 00:20
  • @Peter I'd think that would be the case but I haven't really seen any distinction between how trivial types are handled versus something like a class, so it seems like it'd be undefined still since there is conceptually no object created there yet (until something like `new (foo) int{};` is done). – Lemon Drop Jan 03 '20 at 00:23
  • Related: https://stackoverflow.com/a/42295175/9171697 – ph3rin Jan 03 '20 at 01:04
  • In the original C++ standard, the definition was simple - an object IS a region of storage. In later standards, language lawyers have had field days, and made it more complicated (IMHO, over-complicated in this case). In C++17, the definition of object lifetime (Section 6.8 para 1) says that the lifetime of an object of type T starts when storage of needed alignment and size is obtained (e.g. from an `operator new()` function) AND if the object has non-vacuous initialisation, that initialisation is complete. The definition of non-vacuous only applies to class types. – Peter Jan 03 '20 at 01:05
  • My reading of [basic.life](http://eel.is/c++draft/basic.life#1) is they're trying to allow immediate use of trivially constructed types but haven't gotten the wording quite right. Must be some edge case I'm not seeing. – user4581301 Jan 03 '20 at 01:38

3 Answers


Formally, the word "lifetime" itself has no meaning when there is no object.

[intro.object/1]: The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]). An object occupies a region of storage in its period of construction ([class.cdtor]), throughout its lifetime, and in its period of destruction ([class.cdtor]).

Note that the example you have provided uses the global operator new allocation function, not a new-expression. Therefore, you have not created any object, let alone started its lifetime. You cannot memcpy into it, since there is no object.
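
As a rough illustration of that distinction, here is a minimal sketch. The helper names `via_new_expression` and `via_allocation_function` are made up for this example and do not come from the question. The first uses a new-expression, which obtains storage and creates the `int`. The second calls the allocation function directly and then uses a placement new-expression to create the object in that storage.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper: a new-expression allocates storage AND creates an int.
int via_new_expression() {
    int* p = new int;   // new-expression: an int object exists from here on
    *p = 5;             // assignment to an existing object
    int result = *p;
    delete p;
    return result;
}

// Hypothetical helper: the allocation function alone only returns raw storage.
int via_allocation_function() {
    void* raw = ::operator new(sizeof(int)); // storage only, no int object yet
    int* p = ::new (raw) int;                // placement new-expression creates the int
    assert(static_cast<void*>(p) == raw);    // the object lives in the storage we obtained
    *p = 5;
    int result = *p;
    ::operator delete(raw);                  // int is trivially destructible, so no destructor call
    return result;
}
```

Both helpers return 5; the difference is only in which construct is responsible for creating the object.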

ph3rin
  • Yeah that makes sense, I suppose it's not as direct as I was looking for but the fact that no lifetime is started in these cases does suggest that indeed no object exists and it cannot be assigned or memcpy'd into. – Lemon Drop Jan 03 '20 at 04:49
int * foo = static_cast<int *>(::operator new(sizeof(int)));
*foo = 5;

I feel based on that set of definitions it may be illegal without first calling placement new

It would indeed be undefined.

Additionally I am not sure if that cast is valid without actual int objects having been created there yet

Casting from void* to another object pointer type is always well defined as far as I know. Attempting to access a non-existent object, or an object of an incompatible type, through such a converted pointer is not valid.

Similarly, would performing this action through an "imaginary" object representation like so be valid:
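
For contrast, a small sketch of the case where access through the converted pointer is fine because an `int` object really does live at that address. The function name `write_through_void_pointer` is hypothetical and used only for illustration.

```cpp
#include <cassert>

// Hypothetical helper: round trip int* -> void* -> int* on an existing object.
int write_through_void_pointer() {
    int target = 0;                         // an actual int object
    void* erased = &target;                 // implicit conversion to void*
    int* typed = static_cast<int*>(erased); // cast back to the original pointer type
    assert(typed == &target);               // the round trip yields the original pointer
    *typed = 7;                             // valid: an int object exists at this address
    return target;
}
```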

void * foo = ::operator new(sizeof(int));
int bar = 5;
std::memcpy(foo, &bar, sizeof(int));

Also undefined.

how does something like std::memset work?

It sets the bit pattern of objects. For example:

int foo = 42;
std::memset(&foo, 0, sizeof foo);

All bytes of foo are now 0. No rule was violated.

std::memset doesn't create any objects, just as std::malloc and std::memcpy don't.

TL;DR Use placement new if you want to create objects in allocated bare storage.
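
A minimal sketch of that pattern applied to the memcpy example from the question, assuming a made-up helper name `copy_into_fresh_storage`: the `int` is created in the bare storage first, so the subsequent `std::memcpy` has an object as its destination.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical helper illustrating the TL;DR; not code from the question.
int copy_into_fresh_storage(int value) {
    void* raw = ::operator new(sizeof(int)); // bare storage, no object yet
    int* obj = ::new (raw) int;              // create the int object first
    std::memcpy(obj, &value, sizeof(int));   // source and destination are both int objects
    int result = *obj;                       // reading is fine: the object holds a value now
    assert(result == value);
    ::operator delete(raw);                  // release the storage (int needs no destructor call)
    return result;
}
```

For a type with a non-trivial destructor, the destructor would have to be called explicitly before releasing the storage; that step is omitted here because `int` does not need it.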


P.S. There is a proposal (p0593rX) to introduce implicit creation of trivial objects into C++ which would allow similar patterns to what is allowed in C.

eerorika
  • That makes sense. I guess most of my confusion on the final point came from interpreting things like `std::memset` as actually being a for loop assigning to each memory address (well, in practice it is), meaning that it would itself be violating this (after all, the `char`s it tries to assign to have not been created as objects). If it's to be thought of as a more abstract thing, then I suppose that makes a bit more sense. It just seemed a bit odd to me that you can't memset before objects have been created, but I suppose that is consistent with its usage in other cases. – Lemon Drop Jan 03 '20 at 04:51
  • @LemonDrop narrow character types are special and it is well defined to reinterpret bytes of all objects as chars. – eerorika Jan 03 '20 at 05:08
  • Well I know about how to get things into an object representation like that, but it just seems strange that something like `*(char*)malloc(1) = 0;` would be valid for example, but in the case of an integer it would not be. This is just under the assumption that memset is indeed setting 0s to memory and not some special magic function that just doesn't go by those rules. – Lemon Drop Jan 03 '20 at 05:12
  • If `int * foo = static_cast<int *>(::operator new(sizeof(int)))` followed by a statement `*foo = 5;` gives undefined behaviour, can you explain how casting the result of `malloc()` (for example one of `int *foo = static_cast<int *>(malloc(sizeof(int)))` or `int *foo = (int *)malloc(sizeof(int))` followed by a statement `*foo = 5`) can ever be used without undefined behaviour? – Peter Jan 03 '20 at 07:04
  • @Peter Use placement new to create an `int` object into the allocated memory. – eerorika Jan 03 '20 at 07:11
  • That's not what I asked. Since when has usage of placement new been essential to avoid undefined behaviour when using memory returned by `malloc()`? – Peter Jan 03 '20 at 07:41
  • @Peter I don't have a copy of C++98, but in C++03 it appears to be necessary. – eerorika Jan 03 '20 at 16:08

I believe this is a question of lifetime. Given:

int * foo = static_cast<int *>(::operator new(sizeof(int)));

The question is whether *foo refers to a valid object. According to N3797 § 3.8 [basic.life], it does.

The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and

— if the object has non-trivial initialization, its initialization is complete.

Since int has trivial initialization, simply allocating correctly aligned storage for it begins the lifetime of a valid object of type int. It's certainly true that *foo has an indeterminate value at this point, so observing its value is worthless at best and undefined behavior at worst, but assigning to it is correct.

Similarly, your second example results in foo pointing to a valid object of type int. The subsequent memcpy operation is completely fine.

Matt Weber
  • My interpretation of that section of [basic.life] is that it pertains more to something like a variable: for example, with `int a;` the lifetime begins after space is allocated for the variable and after its initialization (in this case default initialization). Maybe I'm misinterpreting it, but I don't really get how the lifetime can begin if nothing has really been created there, e.g. with placement new or something. – Lemon Drop Jan 03 '20 at 04:46
  • The big issue seems to be that the use of the allocation function does not begin the lifetime of an `int` here. I agree that there are clear requirements for introducing an object that are not met, but I still think this is (mostly) valid. Sticking with [basic.life] but looking at a C++17 draft, N4659 § 6.8/6 states that a pointer to the storage location where an object will live can be used in limited ways, as can the lvalue obtained via indirection through that pointer. I think the example conforms with this. `static_cast`ing is ill-formed here, but that's the only undefined behavior I see. – Matt Weber Jan 04 '20 at 00:57