4

If std::uninitialized_copy is used to an initialized memory, does this use cause a memory leak or an undefined behavior?

For example:

std::vector<std::string> u = {"1", "2", "3"};
std::vector<std::string> v = {"4", "5", "6"};
// What happens to the original elements in v?
std::uninitialized_copy(u.begin(), u.end(), v.begin());
Jeongu Kim
  • 117
  • 6
  • 1
    I don't think the duplicate is accurate. It explains the difference between `std::copy` and `std::uninitialized_copy` but doesn't cover the case where the destination range is not uninitialized. It isn't clear in the duplicate if the copy will call the previous elements destructors or not. – François Andrieux Jun 25 '22 at 17:05
  • @FrançoisAndrieux The example implementation shown at https://en.cppreference.com/w/cpp/memory/ranges/uninitialized_copy 1) seems to indicate in place news are done without destructing what was there. Which kind of makes sense because the developer explicitly has to call unitialized_copy instead of copy. But I must admit it is not clearly specified as a requirement there – Pepijn Kramer Jun 25 '22 at 17:11
  • @FrançoisAndrieux The output range is specified as "*which is an uninitialized memory area*". So passing initialized memory violates the contract of the function and is UB. Doesn't matter what happens in that case. – Goswin von Brederlow Jun 25 '22 at 17:59
  • @GoswinvonBrederlow Is *"uninitialized memory area"* a formally defined concept in the first place? Hmm. – HolyBlackCat Jun 25 '22 at 18:00
  • @HolyBlackCat A grey area. In std::aligned_storage it's called "*uninitialized storage*". At least on https://en.cppreference.com/w/cpp/types/aligned_storage You would have to look up the exact wording in the specs for all the things that handle uninitialized memory to know for sure. Personally the informal description is enough for me to see it's not intended to be used with initialized objects in the output. – Goswin von Brederlow Jun 25 '22 at 18:07
  • @GoswinvonBrederlow It is unclear to me whether "uninitialized memory area" is normative. I don't think it is. Additionally, the requirement should be about whether or not the range contains class object, not about its state of initialization. For example, it seems intuitive that you should be able to call `std::unititialized_copy` on a range which was previously zeroed out with `std::memset`. Such a range is not unititialized, and it contains a number of instances of `unsigned char`, but it would be surprising if such a range couldn't be the destination range for `std::unititialized_copy`. – François Andrieux Jun 25 '22 at 18:19
  • @FrançoisAndrieux Indeed the definition of an uninitialized memory area is confusing; if an object is moved to another location (by move semantics) or destructed by its destructor, then is the area formerly occupied by the object uninitialized? – Jeongu Kim Jun 25 '22 at 18:25
  • 1
    @JeonguKim The formal term for what you are asking about in your comment is "storage" which is memory which may or may not be occupied by an object. Move semantics do not end the lifetime of a moved-from object, so the moved-from object still occupies its storage. It is just in a "moved-from" state, the meaning of which depends on the type of object, but it is still definitely in its lifetime and occupies its storage. If you explicitly call an object's destructor, that does end its lifetime and the storage it used to occupy becomes unoccupied by that object. – François Andrieux Jun 25 '22 at 18:43
  • @FrançoisAndrieux Now I am really clear! – Jeongu Kim Jun 25 '22 at 18:51
  • @FrançoisAndrieux zeroeing out a range with living objects that are not trivially destructible would be UB. But I would say calling the destructor on all objects within a memory range makes it uninitialized again. But for a language lawyer answer someone has to dig through the language specs. – Goswin von Brederlow Jun 25 '22 at 19:45

3 Answers3

7

TL;DR: Don't do this.

Assuming your std::string implementation uses SSO1 (all modern ones do, I believe), then this doesn't leak anything only if the strings are short enough to be SSO'ed.

Any possible UB here is governed by [basic.life]/5:

A program may end the lifetime of an object of class type without invoking the destructor, by reusing or releasing the storage as described above.

[Note 3: A delete-expression ([expr.delete]) invokes the destructor prior to releasing the storage. — end note]

In this case, the destructor is not implicitly invoked and any program that depends on the side effects produced by the destructor has undefined behavior.

It's not entirely clear to me what "depending on side effects" entails (can you solemnly declare that you don't mind the lack of side effects, and get rid of UB this way?), but destroying a SSO'ed string should have no side effects.

But! If you enable iterator debugging, then the destructor might get side effects regardless of SSO (to somehow notify the iterators that they should be invalidated). Then skipping the destructor might be problematic.


1 SSO = small (or short) string optimization = not allocating a short string on the heap, and instead embedding it directly into the std::string instance.

HolyBlackCat
  • 78,603
  • 9
  • 131
  • 207
  • Even w/o short string optimization the only thing that happens is disconnected memory holding the strings. Since that memory will be released at program termination it's not UB any more than not deleting a new. But is is poor programming. – doug Jun 25 '22 at 18:28
  • @doug Don't forget about iterator debugging. I'd be worried about that. – HolyBlackCat Jun 25 '22 at 19:00
  • Yeah. Debugging could get, well, interesting! While an interesting question, and I think your answer enhances understanding of c++'s darker corners, I sure hope people don't do this in practice. – doug Jun 25 '22 at 19:37
4

Base on specialized.algorithms.general and uninitialized.copy I don't find any formal requirements that the destination range must be uninitialized. As such, we can consider the effect of std::uninitialized_copy which is defined as equivalent to the follow (excluding the implied exception safety boilerplate code) :

for (; first != last; ++result, (void) ++first)
  ::new (voidify(*result))
    typename iterator_traits<NoThrowForwardIterator>::value_type(*first);

We can conclude that std::uninitialized_copy does not call any destructor or otherwise cares about what was previously located in the destination range. It simply overwrites it, assuming it is uninitialized.

To figure out what this means in terms of correctness, we can refer to basic.life

A program may end the lifetime of an object of class type without invoking the destructor, by reusing or releasing the storage as described above. In this case, the destructor is not implicitly invoked and any program that depends on the side effects produced by the destructor has undefined behavior.

This however uses the loosely defined notion of "any program that depends on the side effects of". What does it mean to "depend on the side effects of"?

If your question was about overwriting vectors of char or int there would be no problem, as these are not class types and do not have destructors so there can be no side effects to depend on. However std::string's destructor may have the effect of releasing resources. std::basic_string may have addition, more directly observable side effects if a user defined allocator is used. Note that in the case of a range containing non-class type elements std::uninitialized_copy is not required. These elements allow for vacuous initialization and simply copying them to uninitialized storage with std::copy is fine.

Since I don't believe it is possible for the behavior of a program to depend on the release of std::string's resources, I believe the code above is correct in terms of having well defined behavior, though it may leak resources. An argument could be made that the behavior might rely on std::bad_alloc eventually being thrown, but std::string isn't strictly speaking required to dynamically allocate. However, if the type of element used had side effects which could influence the behavior, and the program depended on those effects, then it would be UB.

In general, while it may be well defined in some cases, the code shown violates assumptions on which RAII is based, which is a fundamental feature most real programs depend on. On these grounds std::uninitialized_copy should not be used to copy to a range which already contains objects of class type.

François Andrieux
  • 28,148
  • 6
  • 56
  • 87
3

Uninitialized memory in C++ is a memory that contain no valid object(s), is obtained by call to std::get_temporary_buffer, std::aligned_storage, std::aligned_alloc, std::malloc or similar functions.

There is no way to determine if a supplied memory is initialized with objects. Developers must care of it. The compiler sanitizers can do it, but their memory attributes are not available to a program.

std::uninitialized_copy expects uninitialized memory and makes no other assumptions. Giving an initialized memory to it may result in memory leaks and undefined behavior.

Here the specialized memory algorithm uninitialized_­copy in Algorithms library in the Standard:

template<class InputIterator, class NoThrowForwardIterator>
  NoThrowForwardIterator uninitialized_copy(InputIterator first, InputIterator last,
                                            NoThrowForwardIterator result);
  1. Preconditions: result +[0, (last - first)) does not overlap with [first, last).
  2. Effects: Equivalent to:
    for (; first != last; ++result, (void) ++first)
      ::new (voidify(*result))
        typename iterator_traits<NoThrowForwardIterator>::value_type(*first);
    
  3. Returns: result.
273K
  • 29,503
  • 10
  • 41
  • 64