0

I call the following void function() function every 40ms and I found that the memory consumption increase steadily. The consumption is not obvious at first but after days, the consumption is huge. Can anyone help to explain what is wrong with this code. Is it a thread problem or std::move problem that caused the memory leak.

void do_task(const std::vector<int>& tmp)
{
    // do some work here
}

void function()
{
    std::vector<std::thread> task;
    std::vector<int> tmp1, tmp2;

    GetTempValue(tmp1);
    GetTempValue(tmp2);

    task.push_back(std::thread(do_task, std::move(tmp1)));
    task.push_back(std::thread(do_task, std::move(tmp2)));

    tmp1.clear();
    tmp2.clear();

    UpdateTempValue(tmp1);
    UpdateTempValue(tmp2);

    task.push_back(std::thread(do_task, std::move(tmp1)));
    task.push_back(std::thread(do_task, std::move(tmp2)));

    tmp1.clear();
    tmp2.clear();

    for(int i=0; i<task.size(); i++)
    {
        task[i].join();
    }
}
Johnnylin
  • 507
  • 2
  • 7
  • 26
  • I don't suppose you sent this through Valgrind? Regardless, if move-semantics are doing their job, those clears are pointless. Out of curiosity please update your question with your platform toolchain info, and ideally an [mcve](https://stackoverflow.com/help/mcve). – WhozCraig Aug 13 '17 at 04:08
  • 1
    @rxu. There cannot be a data race issue with code above. The vectors tmp1 and tmp2 are handled quite correctly. Except for the fact that the calls to clear() are redundant. – Michaël Roy Aug 13 '17 at 04:11
  • Sorry for the noise : ) – hamster on wheels Aug 13 '17 at 04:12
  • I would take a harder look into do_task(). What are the OS and compiler used here? – Michaël Roy Aug 13 '17 at 04:12
  • Wait a minute!!! Shouldn't do_task receive a std::vector, and not a reference? – Michaël Roy Aug 13 '17 at 04:15
  • @MichaëlRoy It *should*. That's certainly how I would have done it were I relying on move-semantics. Strangely, even without it (as-is, in other words), I cannot reproduce the OP's problem, where the GetTemp and UpdateTemp functions substitutions (due to no mcve) simply load up 10MB vectors. – WhozCraig Aug 13 '17 at 04:17
  • And of course, what are you doing the other 39.xx seconds per iteration when you're not invoking *this* function? Are you saying trimming this out leaves you no "leak" ?And what are you using to measure your "leak" ? Hopefully something more reliable than Windows task manager. – WhozCraig Aug 13 '17 at 04:19
  • That's milliseconds... An average of 1 new thread every 10 millisecond... Even if only 1 word escapes per thread that would be 400 bytes/second, or 1.4 Mbyte/hour. My initial math was off, sorry,. That's still a lot. – Michaël Roy Aug 13 '17 at 04:22
  • @M ah. thx for the clarification. – WhozCraig Aug 13 '17 at 04:23
  • what if you call the program without `std::move` operation, as the program logic is wrong with it. As after the first move operation clear is already cleared and update function does something on the cleared array. Also why move if you are binding it to const ref? – Hakes Aug 13 '17 at 04:37
  • Are you sure you want to use `tmp1` and `tmp2` after you `std::move`'d them? They might already be waiting for destruction. – Kamajii Aug 13 '17 at 05:27
  • @Kamajii i think it can be reused. the clear operation is redundant . – Johnnylin Aug 13 '17 at 05:30
  • @Johnnylin IMHO this is bad practice. The moved-from object is left in an unspecified state after the move, according to STD. – Kamajii Aug 13 '17 at 06:25

2 Answers2

1

Passing a reference to a thread is a big no-no. Or at least a bug waiting to happen...

You should try redefining do task to accept a std::vector by value. Isn't that what you were trying to do anyway by calling std::move to pass them to the threads?

Change the definition of do_task to:

void do_task(std::vector<int> tmp);

I've made some quick calculations. If the threads started by function() leak, by starting 4 threads every 40ms, you should have a leak rate of over 1.4MB/hour. If your leak is less than that, you should start looking elsewhere in your code.

In any case, starting 100 threads per second is not really efficient. A big chunk of your computing power is lost in creating threads. Have you considered other options?

Having 4 threads running endless loops, and queuing the work to them will be way more efficient, less taxing on the OS, and less prone to leaks that you cannot control.

Michaël Roy
  • 6,338
  • 1
  • 15
  • 19
  • that makes sense. i start 6 threads every 40ms and do_task is a optical character recognition function. So why does it leak? – Johnnylin Aug 13 '17 at 04:36
  • recognition part may be leaking are you using tesseract? – Hakes Aug 13 '17 at 04:38
  • @Hakes the ocr program is not leaking and the vector passed to do_task is just a look up table like purpose. – Johnnylin Aug 13 '17 at 04:39
  • @MichaëlRoy can you elaborate the cause of this leaking problem. i will adopt this as an answer. what we have here is how to fix it. – Johnnylin Aug 13 '17 at 04:42
  • Passing parameters to a thread by reference is OK; see [this answer](https://stackoverflow.com/a/5116923/5231607). – 1201ProgramAlarm Aug 13 '17 at 04:47
  • @Johnnylin I think your app suffers from a design isuue. – Michaël Roy Aug 13 '17 at 04:50
  • @1201ProgramAlarm yes. i think it is a creating thread problem. – Johnnylin Aug 13 '17 at 04:51
  • @1201ProgramAlarm: Having the thread function accept a reference as an argument is never a good idea. – Michaël Roy Aug 13 '17 at 04:56
  • Passing a reference is not a problem. `std::thread` passes the argument to the thread function via `decay_copy`, so you get a new `T` in any case (you have to use `std::ref` if you really wanted a reference). In other words, in OP code `&tmp != &tmp1 && &tmp != &tmp2` for all the `tmp`s. – sbabbi Aug 13 '17 at 21:11
-1

I think openmp may be interresting. OpenMP is an API for developing parallel applications.

https://en.m.wikipedia.org/wiki/OpenMP