8

Title says it.

Sample of bad practive:

std::vector<Point>* FindPoints()
{
   std::vector<Point>* result = new std::vector<Point>();
   //...
   return result;
}

What's wrong with it if I delete that vector later?

I mostly program in C#, so this problem is not very clear for me in C++ context.

Andrey
  • 59,039
  • 12
  • 119
  • 163

6 Answers6

12

As a rule of thumb, you don't do this because the less you allocate on the heap, the less you risk leaking memory. :)

std::vector is useful also because it automatically manages the memory used for the vector in RAII fashion; by allocating it on the heap now you require an explicit deallocation (with delete result) to avoid leaking its memory. The thing is made complicated because of exceptions, that can alter your return path and skip any delete you put on the way. (In C# you don't have such problems because inaccessible memory is just recalled periodically by the garbage collector)

If you want to return an STL container you have several choices:

  • just return it by value; in theory you should incur in a copy-penality because of the temporaries that are created in the process of returning result, but newer compilers should be able to elide the copy using NRVO1. There may also be std::vector implementations that implement copy-on-write optimization like many std::string implementations do, but I've never heard about that.

    On C++0x compilers, instead, the move semantics should trigger, avoiding any copy.

  • Store the pointer of result in an ownership-transferring smart pointer like std::auto_ptr (or std::unique_ptr in C++0x), and also change the return type of your function to std::auto_ptr<std::vector<Point > >; in that way, your pointer is always encapsulated in a stack-object, that is automatically destroyed when the function exits (in any way), and destroys the vector if its still owned by it. Also, it's completely clear who owns the returned object.

  • Make the result vector a parameter passed by reference by the caller, and fill that one instead of returning a new vector.

  • Hardcore STL option: you would instead provide your data as iterators; the client code would then use std::copy+std::back_inserter or whatever to store such data in whichever container it wants. Not seen much (it can be tricky to code right) but it's worth mentioning.


  1. As @Steve Jessop pointed out in the comments, NRVO works completely only if the return value is used directly to initialize a variable in the calling method; otherwise, it would still be able to elide the construction of the temporary return value, but the assignment operator for the variable to which the return value is assigned could still be called (see @Steve Jessop's comments for details).
Community
  • 1
  • 1
Matteo Italia
  • 123,740
  • 17
  • 206
  • 299
  • thanks for great answer! personally I prefer second option, but this makes types awfully long, so you have to use `typedef`s or `define`s, which I also don't like much :) – Andrey Feb 21 '11 at 00:07
  • In this case, it is not a good idea to use std::auto_ptr. They don't play nicely with STL containers. –  Feb 21 '11 at 00:08
  • @John Gaughan they don't if you put `auto_ptr` *inside* container, but it is not the case – Andrey Feb 21 '11 at 00:10
  • 2
    @John: I know that you shouldn't use `std::auto_ptr` as *elements* of STL containers, but I don't see why you shouldn't use them to manage the lifetime of an STL container allocated on the heap. – Matteo Italia Feb 21 '11 at 00:10
  • @Andrey: thank you. I too find that `auto_ptr`-based types are often waaay too long to type (and read!) but if there are no alternatives... :) – Matteo Italia Feb 21 '11 at 00:12
  • Sorry, I was looking at that part backwards apparently. Yes, it is safe to put STL containers in `auto_ptr`s, but not the other way around. –  Feb 21 '11 at 00:13
  • Added "hardcore STL" iterators option. – Matteo Italia Feb 21 '11 at 00:18
  • 2
    "newer compilers should be able to elide the copy using NRVO" - only if the result is used to initialize a variable. If it's assigned to a variable that previously exists then copy constructor elision can get rid of the temporary, but there's still an assignment which can't be elided. C++0x move semantics solve that case too (by making the assignment cheap). So with C++0x there is no longer much reason to avoid returning standard containers by value, but with C++03 there is that awkward case involving a full copy. – Steve Jessop Feb 21 '11 at 01:01
  • @Matteo: although I say, "can't be elided", I suppose I mean "can't *necessarily* be elided". Standard containers will no doubt have assignment implemented by copy-and-swap, and they're templates so everything can be inlined, to the point where the compiler could in principle deduce that the copy is side-effect-free and hence can be removed under the "as-if" rule, never mind copy ctor elision. I don't know what compilers can put all that together and come up with an efficient answer in the case of using a return value as the RHS of an assignment. – Steve Jessop Feb 21 '11 at 01:11
  • @Matteo: "On C++0x compilers, instead, the move semantics should trigger, avoiding any copy" -> not sure about that one, I think that copy elision is preferred to move if possible, because the compiler cannot estimate (in general) the cost of moving. – Matthieu M. Feb 21 '11 at 07:40
6

Creating anything dynamically is bad practice unless it's really necessary. There's rarely a good reason to create a container dynamically, so it's usually not a good idea.

Edit: Usually, instead of worrying about things like how fast or slow returning a container is, most of the code should deal only with an iterator (or two) into the container.

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • You can almost totally avoid using dynamic allocation in code, but it will copy everything at every call. I thought that this is not that good for performance. – Andrey Feb 21 '11 at 00:05
  • 1
    Creating objects dynamically is perfectly fine, even containers. The issue here is returning that pointer -- it opens up the program to a whole new class of bugs that can be difficult to detect. –  Feb 21 '11 at 00:05
  • @Andrey: If you pass something by reference, it won't be copied either. If you don't know it, "Accelerated C++" is a great C++ book, and it doesn't touch pointers until chapter 10, that should tell you something ;) – etarion Feb 21 '11 at 00:10
  • @etarion I know about the references, but you can't return reference to local variable is not good idea either. – Andrey Feb 21 '11 at 00:15
  • 1
    @Andrey: When "at every call", i assumed you meant passing arguments. For returning values, there's RVO, which about every modern compiler uses. – etarion Feb 21 '11 at 00:17
  • @Jerry Coffin Well, your question contains no useful information, question was "Why?" and your answer is just restated questions. Could you rewrite my sample according to best practices? – Andrey Feb 21 '11 at 00:19
  • @etarion I didn't know about RVO, that is interesting stuff. – Andrey Feb 21 '11 at 00:20
  • @Andrey: My answer stated a general rule, of which this is a specific case. – Jerry Coffin Feb 21 '11 at 01:33
2

Creating objects dynamically in general is considered a bad practice in C++. What if an exception is thrown from your "//..." code? You'll never be able to delete the object. It is easier and safer to simply do:

std::vector<Point> FindPoints()
{
  std::vector<Point> result;
  //...
  return result;
} 

Shorter, safer, more straghtforward... As for the performance, modern compilers will optimize away the copy on return and if they are not able to, move constructors will get executed so this is still a cheap operation.

Nemanja Trifunovic
  • 24,346
  • 3
  • 50
  • 88
  • 1
    I wouldn't say "in general". `std::unique_ptr c(new C);` is fine. It's "dumb pointers" that are the problem. – etarion Feb 21 '11 at 00:03
  • The whole underlying container will be copied upon return. I can hardly believe that it is cheap operation. Still I totally agree with exception problem, but it can be solved using `auto_ptr` – Andrey Feb 21 '11 at 00:04
  • Move constructors only exist with recent compilers with C++0x extensions turned on. In general, if the creation and return of the object occur in one sequence point, compilers will optimize the copy constructions away. –  Feb 21 '11 at 00:04
  • 1
    @Andrey: Don't use `auto_ptr`. That _really_ is bad practice. And no, it won't be copied. Google for named return value optimization. – etarion Feb 21 '11 at 00:05
  • @etarion it is completely new subject, but what is wrong for `auto_ptr` especially in simple cases (like this one) – Andrey Feb 21 '11 at 00:09
  • @Andrey: Like "dumb" pointers, it opens up a new class of bugs. Just not using it means you don't have to worry about it. Inside this function, it would have been fine probably - unless you'd pass the `auto_ptr` to another function inside it (e.g. a submethod finding _one_ point), which would invalidate your pointer, meaning the next access to it is UB ... and whatever you return it to also needs to worry that it never copies it by accident, which is just a headache. – etarion Feb 21 '11 at 00:16
  • @etarion that is controllable situation. I know how `auto_ptr` behaves, so it is not hard to avoid unintended invalidation – Andrey Feb 21 '11 at 00:21
  • @Andrey: I'll just say "famous last words". Many have thought that and got bitten. – etarion Feb 21 '11 at 00:26
  • @Andrey: The way you wrote it, the object won't be copied - as others pointed out search "NRVO". If you think that would be a problem, pass the vector as an "out" reference to the function. – Nemanja Trifunovic Feb 21 '11 at 00:34
  • @Nemanja Trifunovic yeah, I already discovered miracle of NRVO – Andrey Feb 21 '11 at 00:37
  • @etarion: Of course, dynamic creation of object is needed in some cases (i.e. for run-time polymorphism) and unique_ptr surely helps. However creating an object on the stack, when possible, is simpler and easier to type. – Nemanja Trifunovic Feb 21 '11 at 00:37
  • @Nemanja Trifunovic: Polymorphism requires a pointer or reference -- but that does *not* necessarily imply dynamic creation of objects. – Jerry Coffin Feb 21 '11 at 01:43
0

Perhaps you're referring to this recent question: C++: vector<string> *args = new vector<string>(); causes SIGABRT

One liner: It's bad practice because it's a pattern that's prone to memory leaks.

You're forcing the caller to accept dynamic allocation and take charge of its lifetime. It's ambiguous from the declaration whether the pointer returned is a static buffer, a buffer owned by some other API (or object), or a buffer that's now owned by the caller. You should avoid this pattern in any language (including plain C) unless it's clear from the function name what's going on (e.g strdup, malloc).

The usual way is to instead do this:

void FindPoints(std::vector<Point>* ret) {
   std::vector<Point> result;
   //...
   ret->swap(result);
}

void caller() {
  //...
  std::vector<Point> foo;
  FindPoints(&foo);
  // foo deletes itself
}

All objects are on the stack, and all the deletion is taken care of by the compiler. Or just return by value, if you're running a C++0x compiler+STL, or don't mind the copy.

Community
  • 1
  • 1
John Ripley
  • 4,434
  • 1
  • 21
  • 17
  • Oh and not just memory leaks: what if another function looked like FindPoints() but returned a pointer to a buffer that isn't allocted just for the caller? The caller then mistakenly deletes it, and you get a double delete and heap corruption. – John Ripley Feb 21 '11 at 00:10
  • yes, that question inspired me :) well, that *usual* way looks slightly odd, because it messes return values and parameters – Andrey Feb 21 '11 at 00:11
  • It's standard practice ("Effective C++" style anyway) to infer arguments are return values by passing as non-const pointers. It's an efficient return mechanism on both old and C++0x compilers. – John Ripley Feb 21 '11 at 00:18
0

I like Jerry Coffin's answer. Additionally, if you want to avoid returning a copy, consider passing the result container as a reference, and the swap() method may be needed sometimes.

void FindPoints(std::vector<Point> &points)
{
    std::vector<Point> result;
    //...
    result.swap(points);
}
albert
  • 364
  • 3
  • 8
0

Programming is the art of finding good compromises. Dynamically allocated memory can have some place of course, and I can even think to problems where a good compromise between code complexity and efficiency is obtained using std::vector<std::vector<T>*>.

However std::vector does a great job of hiding most needs of dynamically allocated arrays, and managed pointers are many times just a perfect solution for dynamically allocated single instances. This means that it's just not so common finding cases where an unmanaged dynamically allocated container (or dynamically allocated whatever, actually) is the best compromise in C++.

This in my opinion doesn't make dynamic allocation "bad", but just "suspect" if you see it in code, because there's an high probability that better solutions could be possile.

In your case for example I see no reason for using dynamic allocation; just making the function returning an std::vector would be efficient and safe. With any decent compiler Return Value Optimization will be used when assigning to a newly declared vector, and if you need to assign the result to an existing vector you can still do something like:

FindPoints().swap(myvector);

that will not do any copying of the data but just some pointer twiddling (note that you cannot use the apparently more natural myvector.swap(FindPoints()) because of a C++ rule that is sometimes annoying that forbids passing temporaries as non-const references).

In my experience the biggest source of needs of dynamically allocated objects are complex data structures where the same instance can be reached using multiple access paths (e.g. instances are at the same time both in a doubly linked list and indexed by a map). In the standard library containers are always the only owner of the contained objects (C++ is a copy semantic language) so it may be difficult to implement those solutions efficiently without the pointer and dynamic allocation concept.

Often you can stil reasonable-enough compromises that just use standard containers however (may be paying some extra O(log N) lookups that you could have avoided) and that, considering the much simpler code, can be IMO the best compromise in most cases.

6502
  • 112,025
  • 15
  • 165
  • 265