
I've spent the last week porting a recursive Branch&Cut algorithm from Matlab to C++ and hoped to see a significant decrease in solution time, but as incredible as it may sound, the opposite was the case. Now, I am not really an expert on C++, so I downloaded the Sleepy profiler and tried to find potential bottlenecks. I'd like to ask whether I am drawing the right conclusions from this, or whether I am looking in a completely wrong direction.

I let the code run for 137 seconds, and this is what the profiler displays (many other entries below, but they don't matter):

[Profiler screenshot: the top entries are `new` at roughly 98 seconds and `free` at roughly 34 seconds; everything else is negligible.]

So if I get this right, 98 seconds were spent creating new objects, and 34 seconds were spent freeing up memory (i.e. deleting objects).

I will go through my code and look where I could do things better, but I also wanted to ask if you have any hints on frequent mistakes or bad habits that generate such behavior. The one thing that comes to my mind is that I use a lot of temporary std::vectors in my code to compute stuff, so that might be slow.

To keep you from going blind, I won't post my code until I've looked through it a bit, but if I can't solve this on my own, I'll be back.

Christoph
  • This isn't a real question. There's no way to say what's going on without seeing code. But generally, novice C++ programmers often write slow programs because they don't understand the value semantics and copy-constructor semantics of C++. You need to use references or move semantics whenever possible to avoid unnecessary copying of vectors/strings, as well as `vector::reserve` to reduce the number of reallocations. – Charles Salvia Jan 25 '13 at 14:05
  • I'm still a bit new regarding C++ but I think allocating temporary space on the stack will help a lot. – ChaosPandion Jan 25 '13 at 14:05
  • Are you profiling a debug build? You should be profiling with all optimisations on. – Peter Wood Jan 25 '13 at 14:11
  • It's a release build and Visual Studio is set to "full optimization", so it's a problem of my code I'm afraid. But thanks for your comments, I see that I am on the right track! – Christoph Jan 25 '13 at 14:17
  • How are you passing your vectors (especially the large ones) between your functions? By value: `void function(vector<T> a)`, by reference: `void function(vector<T> &a)`, or by pointer: `void function(vector<T> *a)` resp. `void function(shared_ptr<vector<T>> a)`? – us2012 Jan 25 '13 at 14:22
  • By reference, seemed the easiest to me. Is there a huge difference between passing by reference and passing by pointer? – Christoph Jan 25 '13 at 14:30
  • No, I was guessing that you were passing by value. By ref should be fine. Have you checked that your code is correct, even if it's slow? I.e., does your code finish and give correct answers on small integer LP problems? – us2012 Jan 25 '13 at 14:35
  • @Christoph Yes, references cannot be null. – Alex Chamberlain Jan 25 '13 at 14:42
  • 1) Try the [*random pause*](http://stackoverflow.com/a/378024/23771) method. 2) Do it under the debugger and use a debug build. Some assistant professors somewhere are telling people you should only profile with optimizations turned on, and that's academic toy-program nonsense. 3) It looks like your first big problem is too much new-ing and free-ing. The first few pause samples will tell you exactly which statements are responsible, and you can either move them out of inner loops or pool used objects. – Mike Dunlavey Jan 25 '13 at 17:44

4 Answers


Yes, std::vector can be expensive if misused. The biggest performance hit is re-allocation: since the size must be dynamically adjustable and the elements are required to live in contiguous memory, a re-allocation happens whenever you add elements beyond the capacity already allocated.

That's why you should declare the size beforehand whenever you can. If you know you'll have to hold n elements, declare it as

 std::vector<MyClass> x(n);

or

 std::vector<MyClass> x;
 x.reserve(n);

instead of just

 std::vector<MyClass> x;

followed by n `push_back` calls.
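
Note that the two forms above are not interchangeable: `x(n)` creates n value-initialized elements up front (so you write them via `operator[]`), while `reserve(n)` only sets the capacity and leaves the vector empty (so you still `push_back`). A minimal sketch of the distinction:

    #include <vector>

    int main() {
        const int n = 1000;

        std::vector<double> a(n);      // size() == n, elements value-initialized
        for (int i = 0; i < n; ++i)
            a[i] = 0.5 * i;            // write through operator[]

        std::vector<double> b;
        b.reserve(n);                  // size() == 0, capacity() >= n
        for (int i = 0; i < n; ++i)
            b.push_back(0.5 * i);      // no reallocation for the first n elements
    }

Mixing them up - constructing with `x(n)` and then calling `push_back` n times - leaves you with 2n elements, a common source of both slowdowns and bugs.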

If after this it's still slow, you can provide std::vector with a custom allocator. I hope you don't get to that point.

Luchian Grigore
  • I am declaring the size beforehand, so that's some pitfall I avoided :) – Christoph Jan 25 '13 at 14:13
  • If the automatic resize always doubles the capacity, then at vector size M at most 2M copies (and at most log2(M) reallocations) have been performed - substantially fewer given any reasonable initial capacity. For any growth multiplier B, the number of copies per element at any vector size is at most B/(B-1), so at 1.5x it's at most three and at 1.25x at most five. Unless copies are remarkably expensive, this is down in the noise, not worth worrying about. – jthill Jan 25 '13 at 17:28
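
(For the curious, jthill's bound is just a geometric series. With M elements and growth factor B, the element copies performed by all reallocations total at most

$$M + \frac{M}{B} + \frac{M}{B^2} + \cdots = \frac{B}{B-1}\,M,$$

which gives 2M for B = 2, 3M for B = 1.5, and 5M for B = 1.25.)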

In Matlab, everything is a reference to an object. So when you pass objects around, you are doing the equivalent of copying a pointer (a few bytes, depending on the platform), as opposed to copying an entire matrix, which is probably orders of magnitude bigger.

Without seeing any code, I can't say for sure, but I suspect you are copying loads of objects instead of copying references to them. I suggest you look into smart pointers, such as std::shared_ptr.
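
To illustrate the difference (hypothetical function names): a by-value parameter copies the whole container on every call, while a const reference does not, with the same syntax at the call site:

    #include <numeric>
    #include <vector>

    // Copies all of v's elements on every call.
    double sumByValue(std::vector<double> v) {
        return std::accumulate(v.begin(), v.end(), 0.0);
    }

    // No copy; the caller's vector is read in place.
    double sumByRef(const std::vector<double>& v) {
        return std::accumulate(v.begin(), v.end(), 0.0);
    }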

You didn't make it clear, but you should compile with optimisations enabled (e.g. g++ -O3). Some expensive copies and other operations can be optimised out, but not all.

Furthermore, if you are new to C++, you shouldn't be using new. It is for experts, and is only to be used after a discussion with a colleague and a strong cup of coffee. (Of course, new may be used on your behalf by some containers, such as std::vector.)
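
For instance (a sketch, assuming C++11 and a hypothetical `Node` type), ownership can be left entirely to containers and smart pointers, so no explicit delete ever appears in your code:

    #include <memory>
    #include <vector>

    struct Node {
        double bound;
        std::vector<double> coefficients;
    };

    int main() {
        // The vector calls new/delete internally; you never do.
        std::vector<Node> nodes(100);

        // Heap allocation without a raw new: the object is freed
        // automatically when the last shared_ptr goes out of scope.
        auto root = std::make_shared<Node>();
        root->bound = 0.0;
    }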

Alex Chamberlain
  • He's probably not directly using `new`, but `std::vector` is calling `new` internally. – Charles Salvia Jan 25 '13 at 14:09
  • You are being a bit dramatic asserting that new should only be used by experts. With a bit of thought one can easily use new in a safe and performant manner. – ChaosPandion Jan 25 '13 at 14:12
  • I am not using new myself, luckily. I also pass vectors by reference to functions; that's one thing I already found out :) – Christoph Jan 25 '13 at 14:16
  • @ChaosPandion I actually disagree with you quite strongly. New users can now use `std::make_shared` to instantiate objects on the heap and a well designed API can remove that requirement completely. There's just no need for beginners to use it. – Alex Chamberlain Jan 25 '13 at 14:16
  • @Charles Salvia - You are not only guessing about the code you can't see, but also about what is hidden behind it (the profiler mentions new and free, but you guess that std::vector is being used). – SChepurin Jan 25 '13 at 14:19
  • @SChepurin, the OP specifically said he's using `std::vector`. But yeah, without code it's all a guess. – Charles Salvia Jan 25 '13 at 14:21
  • Well, just to be sure: if I write `std::vector<T> test(10)`, somewhere new is called to create this vector - and if I do this a lot of times, e.g. inside of a loop, this would use a lot of time and show up in the profiler like it does now. – Christoph Jan 25 '13 at 14:24
  • @Christoph Yes it would. Stick it outside the loop (see the sketch after these comments)... – Alex Chamberlain Jan 25 '13 at 14:27
  • @Christoph - Where does this magic (10) come from? If you know the size, then maybe you don't need dynamic allocation and an array will be enough? And much faster. – SChepurin Jan 25 '13 at 14:33
  • @Christoph - But if you prefer the safety of std::vector, then "It is a good idea to use reserve if you're doing a lot of inserting and you have good knowledge of the size of what you're storing. It's much faster." (see http://stackoverflow.com/questions/2761570/c-stdvector-memory-allocation) – SChepurin Jan 25 '13 at 14:39
  • @Christoph Or use a `std::array`. – Alex Chamberlain Jan 25 '13 at 14:41
  • The 10 in my example was just magic, in my code the size unfortunately comes from a variable. But I will see if I can change it up a bit and maybe introduce arrays at certain points, thanks for the suggestion. I have to go now unfortunately, but I'll try to improve my code and report back when something changes! – Christoph Jan 25 '13 at 15:09
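
A minimal sketch of the loop-hoisting suggested above (hypothetical names; assuming the required size n is known per iteration):

    #include <vector>

    void solveSubproblems(int iterations, int n) {
        std::vector<double> scratch;    // hoisted: allocated once, not per iteration
        scratch.reserve(n);
        for (int i = 0; i < iterations; ++i) {
            scratch.clear();            // size -> 0, but capacity is kept
            for (int j = 0; j < n; ++j)
                scratch.push_back(0.5 * j);   // stand-in for the real computation
            // ... use scratch ...
        }
    }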

As Luchian Grigore stated, std::vector can be expensive if misused. But it's not just that: any part of your code that constantly creates objects, moves data around, reallocates memory or deletes it can be time-consuming. If what Alex Chamberlain says is true, that is your big bottleneck. If not, remember (or let me tell you, in case you didn't know) that almost everything you do to make your software cheaper in memory consumption is paid for in performance. The opposite is also true: higher memory consumption means the CPU has less to compute, because you can keep a cache of everything computed before. It's basic, but sometimes we forget it.
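
As an illustration of that trade-off, here is a minimal memoization sketch (hypothetical names): spend memory on a cache of previous results so the CPU doesn't recompute them.

    #include <cmath>
    #include <unordered_map>

    // Stand-in for some expensive, pure computation.
    double expensiveBound(int subproblem) {
        double s = 0.0;
        for (int i = 1; i < 1000000; ++i)
            s += std::sin(subproblem * 0.001 * i);
        return s;
    }

    double cachedBound(int subproblem) {
        static std::unordered_map<int, double> cache;   // pay memory...
        auto it = cache.find(subproblem);
        if (it != cache.end())
            return it->second;                          // ...save CPU
        double value = expensiveBound(subproblem);
        cache.emplace(subproblem, value);
        return value;
    }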

Adrián Pérez

Avoid calling new and delete in time-critical code - they are slow. Alternatively, use a fast memory manager (e.g. SmartHeap).
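
As a sketch of what pooling buys you (hypothetical names; fixed-size nodes, no growth or thread-safety): acquiring and releasing a node becomes a couple of pointer moves instead of a trip to the heap.

    #include <cstddef>
    #include <vector>

    struct BcNode { double bound; int depth; BcNode* next; };

    class NodePool {
        std::vector<BcNode> storage_;
        BcNode* freeList_ = nullptr;
    public:
        explicit NodePool(std::size_t n) : storage_(n) {
            for (BcNode& node : storage_) {   // chain all nodes into a free list
                node.next = freeList_;
                freeList_ = &node;
            }
        }
        BcNode* acquire() {                   // O(1), no call into the heap
            if (!freeList_) return nullptr;   // pool exhausted
            BcNode* n = freeList_;
            freeList_ = n->next;
            return n;
        }
        void release(BcNode* n) {             // O(1) as well
            n->next = freeList_;
            freeList_ = n;
        }
    };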

Tobias Langner