
I have a real-time 3D program using OpenFrameworks that easily runs at 60 fps on a single thread under normal conditions. I also have a second, auxiliary thread which, while it is working, causes my main thread's update rate to drop and stall intermittently.

To debug this, I have tried to isolate the issue as follows:

  1. The main thread does relatively little work in my test conditions, and achieves 2000 fps when the auxiliary thread is running.
  2. The second thread runs in a continuous loop and tries to load a large (20 MB) XML file via the following:

    while (true) 
    {
        ofXml test;
        test.load("bigfile.xml");
    }
    
  3. There are no programmatically shared resources between the main thread and auxiliary thread and there are no locks/mutexes in either thread.

Profiling seems to give me one possible answer: heap allocations and deallocations on the auxiliary thread may be swamping the heap allocator, which in turn slows down the main thread's allocations and deallocations.

I used a sampling profiler, Very Sleepy, to profile my main thread for 30 seconds, first with the auxiliary thread suspended and then with it resumed.

With the auxiliary thread suspended, the top exclusive% functions are as follows, together with a look at the call stack for the time spent in new:

aux thread suspended top exclusive in main thread

aux thread suspended top stack for new in main thread

And now the same running with the auxiliary thread active:

aux thread active top exclusive in main thread

aux thread active top stack for new in main thread

This seems to suggest that std::string allocations in the main thread are being slowed by heap allocations in the auxiliary thread, and that this may be what is causing the slowdown and stalls. Does that sound correct, or could I be missing something? I want to make sure my analysis is correct before I try to rewrite my code without std::string.
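
One way to check this hypothesis independently of OpenFrameworks is a small stand-alone test. The following is only a sketch under stated assumptions: `countMainIterations` is a hypothetical helper, and the auxiliary loop of many small string allocations is a stand-in for the repeated `ofXml` load, not a model of what the XML parser actually does. It counts how many small `std::string` allocations a "main" loop completes in a fixed time, with and without a second thread hammering the heap.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: runs a "main" loop of small std::string allocations
// for `duration` and returns how many iterations completed. When
// `withAuxThread` is true, a second thread allocates and frees many small
// strings at the same time (an assumed stand-in for the XML loading loop).
std::size_t countMainIterations(bool withAuxThread, std::chrono::milliseconds duration)
{
    std::atomic<bool> stop{false};
    std::thread aux;
    if (withAuxThread)
    {
        aux = std::thread([&stop] {
            while (!stop.load())
            {
                // Many small heap allocations, all freed at the end of each pass.
                std::vector<std::string> nodes;
                for (int i = 0; i < 1000; ++i)
                    nodes.emplace_back(64, 'x');
            }
        });
    }

    std::size_t iterations = 0;
    std::size_t sink = 0; // accumulates a result so the loop body is not trivially dead
    const auto end = std::chrono::steady_clock::now() + duration;
    while (std::chrono::steady_clock::now() < end)
    {
        std::string s(64, 'a'); // longer than typical small-string buffers, so it normally hits the heap
        sink += s.size();
        ++iterations;
    }

    stop.store(true);
    if (aux.joinable())
        aux.join();
    return sink > 0 ? iterations : 0;
}
```

Calling `countMainIterations(false, std::chrono::milliseconds(2000))` and `countMainIterations(true, std::chrono::milliseconds(2000))` and comparing the two counts gives a rough indication of how much the second thread's allocations cost the first. The result depends heavily on the platform's allocator, the core count and the optimizer, so it is only a hint, not a proof.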

Chnossos
skimon
  • Perhaps your code using std::string is inefficient. Pass by const reference where possible, reuse string instances where possible, prefer += to concatenate strings instead of + or stringstream. – Neil Kirk Jul 09 '14 at 21:59
  • @neil It's possible; however, I'm less concerned by that, since without the second thread the performance is excellent. According to the profiler, the time spent as a percentage of the running time increases from about 30% to 60%. – skimon Jul 09 '14 at 22:37
  • First of all: Are you testing this in debug or release build? If not release build, do that. And yes, the heap is probably, at some point during the calls, a resource that can only be accessed by one thread at a time. It could also be other factors. Without studying the code for your framework for xml parsing, who knows... – Mats Petersson Jul 09 '14 at 22:47
  • you could also try out a different allocator, such as tcmalloc. – milianw Jul 09 '14 at 22:58
  • @skimon Why aren't you concerned? Less heap usage = faster, right? – Neil Kirk Jul 09 '14 at 23:17
  • @mats Yep these measurements are for Release mode. – skimon Jul 09 '14 at 23:41

1 Answer


Since the profiler lets you see individual call stacks on the auxiliary thread, those are what to look at. For example, the stack you show has over 20% of its time going to reallocating strings, probably in order to grow them. If you look at other stack samples as well, I bet they are also in the process of growing strings (just not inside new), which would mean an even higher percentage is spent growing strings. If so, anything you can do to increase the default string allocation or the grow-by size should help.
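
As a rough illustration of why the initial allocation size matters, here is a minimal sketch that counts how often a `std::string` changes capacity while it grows one character at a time, with and without an up-front `reserve`. The helper name `countReallocations` is hypothetical, and whether the strings inside the XML parser can be pre-sized this way is an assumption that depends on the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `n` characters one at a time and returns how many times the
// string's capacity changed, i.e. how many times it had to reallocate.
// If `reserveFirst` is true, room for all `n` characters is reserved up front.
std::size_t countReallocations(std::size_t n, bool reserveFirst)
{
    std::string s;
    if (reserveFirst)
        s.reserve(n);

    std::size_t reallocations = 0;
    std::size_t lastCapacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        s.push_back('x');
        if (s.capacity() != lastCapacity)
        {
            ++reallocations;
            lastCapacity = s.capacity();
        }
    }
    return reallocations;
}
```

With `reserveFirst` set, the count is zero because the capacity already covers the final size. Without it, the count is some implementation-dependent number of regrowths, each of which is a trip to the heap allocator.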

(BTW, you can see from this that exclusive time is not helpful. Inclusive time is what you need, but even that is not as useful as simply examining stack samples so you can see why it's doing what it's doing. That's why this technique is so effective.)

Community
Mike Dunlavey