9

I have some "FreeOnTerminate" worker threads which add their handles on a TThreadList when they start executing and remove from the same when their execution ends. They also check on a global event object that would notify them to cancel their work.

Following is the part that runs in the main thread which signals the event and waits for possible worker threads to end. WorkerHandleList is the global ThreadList.

...

procedure WaitForWorkers;
var
  ThreadHandleList: TList;
begin
  ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
  TWorkerThread.WorkerHandleList.UnlockList;
  WaitForMultipleObjects(ThreadHandleList.Count,
      PWOHandleArray(ThreadHandleList.List), True, INFINITE);
end;

initialization
  TWorkerThread.RecallAllWorkers := TEvent.Create;
  TWorkerThread.WorkerHandleList := TThreadList.Create;

finalization
  TWorkerThread.RecallAllWorkers.SetEvent;
  WaitForWorkers;

  TWorkerThread.RecallAllWorkers.Free;
  TWorkerThread.WorkerHandleList.Free;


This design, I think, has a flaw in that I have to unlock the list just before waiting on the threads' handles because that would cause a deadlock since the threads themselves remove their handles from the same list. Without any lock, a context switch could cause a thread to free itself causing WaitForMultipleObjects to return immediately with WAIT_FAILED. I can't employ another lock either since WaitForMultipleObjects is blocking and I wouldn't be able to release the lock from the main thread.

I can modify this design in a number ways including not using FreeOnTerminate threads, which would guarantee valid handles until they are explicitly freed. Or modifying the list of thread handles only from the main thread. Or probably others...

But what I want to ask is, is there a solution to this kind of problem without changing the design? For instance, would sleeping in worker thread code before they remove their handles from the list, or calling SwitchToThread cause all non-worker threads have a run? Enough run?

Sertac Akyuz
  • 54,131
  • 4
  • 102
  • 169
  • Here a [related question](http://stackoverflow.com/questions/8252804/when-to-free-a-thread-manually/8254173#8254173) (which I'll need to rephrase for you here). – NGLN Sep 08 '16 at 04:34

2 Answers2

10

Your use of LockList() is wrong, and dangerous. As soon as you call UnlockList(), the TList is no longer protected, and will be modified as the worker threads remove themselves from the list. That can happen before you have a chance to call WaitForMultipleObjects(), or worse WHILE setting up the call stack for it.

What you need to do instead is lock the list, copy the handles to a local array, unlock the list, and then wait on the array. DO NOT wait on the TList itself directly.

procedure WaitForWorkers;
var
  ThreadHandleList: TList;
  ThreadHandleArr: array of THandle;
begin
  ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
  try
    SetLength(ThreadHandleArr, ThreadHandleList.Count);
    for I := 0 to ThreadHandleList.Count-1 do
      ThreadHandleArr[i] := ThreadHandleList[i];
  finally
    TWorkerThread.WorkerHandleList.UnlockList;
  end;

  WaitForMultipleObjects(Length(ThreadHandleArr), PWOHandleArray(ThreadHandleArr), True, INFINITE);
end;

However, even that has a race condition. Some of the worker threads may have already terminated, and thus destroyed their handles, before WaitForMultipleObjects() is actually entered. And the remaining threads will destroy their handles WHILE it is running. Either way, it fails. You CANNOT destroy the thread handles while you are actively waiting on them.

FreeOnTerminate=True can only be used safely for threads that you start and then forget even exist. It is very dangerous to use FreeOnTerminate=True when you still need to access the threads for any reason (it is especially because of this caveat that TThread.WaitFor() tends to crash when FreeOnTerminate=True - the thread handle and even the TThread object itself gets destroyed while it is still being used!).

You need to re-think your waiting strategy. I can think of a few alternatives:

  1. don't use WaitForMultipleObjects() at all. It is safer, but less efficient, to simply re-lock the list periodically and check if it is empty or not:

    procedure WaitForWorkers;
    var
      ThreadHandleList: TList;
    begin
      repeat
        ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
        try
          if ThreadHandleList.Count = 0 then Exit;
        finally
          TWorkerThread.WorkerHandleList.UnlockList;
        end;
        Sleep(500);
      until False;
    end;
    
  2. Get rid of WorkerHandleList altogether and use a semaphore or interlocked counter instead to keep track of how many threads have been created and have not been destroyed yet. Exit the wait when the semaphore/counter indicates that no more threads exist.

  3. like Ken B suggested, keep using WorkerHandleList but wait on a manual-reset event that gets reset when the first thread is added to the list (do that in the thread constructor, not in Execute()) and signaled when the last thread is removed from the list (do that in the thread destructor, not in Execute() or DoTerminate()).

Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
  • Thanks. I understand the point about copying the handles. And possible thread termination before the wait, which was my concern in the question. However I don't understand that I can't destroy thread handles while waiting on them. Isn't that the point of waiting on a thread? How else `WaitForMultipleObjects` could return when you pass thread handles to it? – Sertac Akyuz Sep 06 '16 at 20:44
  • You can wait on an *open* thread handle, and it will be signaled when the thread fully terminates at the OS layer, thus satisfying the wait. But the handle has to remain open and valid until that wait is finished, *THEN* you can destroy it. If you *destroy* the thread handle *before/while* it is being waited on, especially if you are waiting on other open handles at the same time, the wait will fail and the error won't be able to tell you *which* handle caused the error so you can remove it from the array and continue waiting on the remaining handles. – Remy Lebeau Sep 06 '16 at 21:21
  • Ok, thanks. `FreeOnTerminate` with a wait function is not an option then, like you've already said :). – Sertac Akyuz Sep 06 '16 at 21:25
  • 3
    While the list is locked, you could *duplicate* all the handles. Then after the list is unlocked, you could wait on those duplicates. They'll remain valid even after the originals are closed. Enhancement opportunities exist regarding when you choose to close any given duplicate. – Rob Kennedy Sep 07 '16 at 01:07
  • True, but at the small cost of allocating new resources for those duplicates. – Remy Lebeau Sep 07 '16 at 01:25
  • Thanks @Rob. That's the kind of solution I was trying to find. I couldn't make it work yet however. Neither with actual handles... Asked a [question](http://stackoverflow.com/questions/39380131/why-does-waitformultipleobjects-fail-with-multiple-thread-handles) about that. – Sertac Akyuz Sep 07 '16 at 22:51
  • I got my answer, I was testing with too many threads (more than `MAXIMUM_WAIT_OBJECTS`). Duplicating the handles seems to work fine, with FreeOnTerminate threads... Since it additionally implicitly forces me to not to use the internal list of `TThreadList`, I'd probably accepted that as an answer. I learned much from Remy's answer however... – Sertac Akyuz Sep 07 '16 at 23:13
5

Assuming a few of things (That only the main thread would start secondary ones, amongst other), the simplest way to fix the issue would be like this :

procedure WaitForWorkers;
var
  ThreadHandleList: TList;
  iItemCount : Integer;
begin
  repeat
    ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
    try
      iItemCount := ThreadHandleList.Count 
    finally
      TWorkerThread.WorkerHandleList.UnlockList;
    end;
    if iItemCount = 0 then
      BREAK;
    sleep(Whatever_is_suitable);
  until False;
end;

If wasting any cpu cycle, or waiting longer than necessary is not acceptable, you could create an Event and wait on it instead, and have all the thread remove themselves from the list through the same function.

procedure WaitForWorkers;
begin
  Event.WaitFor(INFINITE);
end;

procedure RemoveHandleFromList(AHandle : THandle);
var
  ThreadHandleList: TList; 
  idx : Integer;
begin
  ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
  try
    idx := ThreadHandleList.IndexOf(Pointer(AHandle));
    if idx >= 0 then
    begin
      ThreadHandleList.Delete(idx);
      if ThreadHandleList.Count = 0 then
        Event.SetEvent;
    end; 
  finally
    TWorkerThread.WorkerHandleList.UnlockList;
  end;
end;

In this case, you'd probably want to use a manual reset event and reset it in a "AddHandleToList" procedure.

Ken Bourassa
  • 6,363
  • 1
  • 19
  • 28