0

We are attempting to read information from a hardware device, and the only way to do this is to communicate with it through a closed-source, native DLL the hardware manufacturer provides. They also provide a .NET wrapper to access the DLL, which the wrapper method of concern is simplified below:

[DllImport("hardware_mfg.dll")]
private static extern int hardware_command_unicode(MarshalAs(UnmanagedType.LPWStr)] string outdata, uint outcount, [MarshalAs(UnmanagedType.LPWStr)] StringBuilder indata, uint maxdata, ref uint incount);

public static int HardwareCommand(string senddata, StringBuilder recdata)
{
    uint incount = 0;
    return HardwareWrapper.hardware_command_unicode(senddata, (uint)senddata.Length, recdata, (uint)recdata.Capacity, ref incount);
}

The code calling the HardwareWrapper.HardwareCommand function is:

// This method gets called from a DispatcherTimer.Tick event
public static int SendCommand(string command, StringBuilder buffer)
{
    lock (_locker)
    {
        try
        {
            if (_commandInProgress)
            {
                // This exception gets thrown
                throw new InvalidOperationException("This should not be possible");
            }
            _commandInProgress = true;
            // InvalidOperationException gets thrown while previous call to HardwareCommand here has not yet returned
            var result = HardwareWrapper.HardwareCommand(command, buffer);
            return result;
        }
        finally
        {
            _commandInProgress = false;
        }
    }        
}

The confusing part is that the InvalidOperationException gets thrown. When the main thread enters the var result = HardwareWrapper.HardwareCommand(...) it is possible for the method to be called again, and enter the same function before the first call returns. It is intermittent that the exception gets thrown, but letting this code run for 15-30 seconds will be enough to have the exception happen.

  1. How is it possible for the main thread to exist twice in one method?
  2. What can be done to prevent this from happening?

EDIT 1: Move lock to outer scope

Clark
  • 478
  • 6
  • 14
  • Reentrant calls are pretty normal. Function A calls function B which calls function A. That's what you have got. Presumably `HardwareWrapper.HardwareCommand` leads to the timer tick event firing again – David Heffernan Aug 22 '16 at 13:18
  • HardwareWrapper.HardwareCommand is unmanaged code, and it does not call SendCommand or cause the timer tick event to fire. The DispatcherTimer event can fire again while the main thread is in the unmanaged code -- but shouldn't the main thread be waiting for the unmanaged code to return before reentering this method? – Clark Aug 22 '16 at 13:19
  • Are you sure it's not a race condition on the `Tick` being executed again before `finally` is entered and flag reset? Try resetting the flag inside the `lock` – slawekwin Aug 22 '16 at 13:20
  • Would the lock have any impact, as it is the **same thread** that is re-entering, and not a different thread (the mfg DLL specifies it does not support multi threaded access, even in parallel. E.g. only one thread can ever make calls to it) – Clark Aug 22 '16 at 13:21
  • I don't see how a single thread can execute two distinct parts of the code, thus I assume there has to be more than one thread executing it, even though `DispatcherTimer` is supposed to call handlers on the UI thread. If you are 100% sure it's not it, then don't check it – slawekwin Aug 22 '16 at 13:24
  • 3
    Clearly `HardwareWrapper.HardwareCommand` does cause the timer to tick again. You should be able to see this by looking at the call stack. Of course, I am assuming that the information you present is correct. – David Heffernan Aug 22 '16 at 13:24
  • 2
    Your _locker cannot lock anything so get rid of that first. It can't do its job since the Tick event always fires on the same thread. Which leaves the `bool` variable as the only way to avoid the DoEvents-style re-entrancy problem. Feel free to send a nasty-gram to the author of this code. – Hans Passant Aug 22 '16 at 13:25
  • Moving the lock to the outmost scope doesn't resolve the issue; the exception is still throwing. I'm guessing that interop marshalling is a culprit here? – Clark Aug 22 '16 at 13:25
  • That's not guessing. That's wishful thinking. Look at the evidence, and stop guessing, and stop hoping. – David Heffernan Aug 22 '16 at 13:26
  • @DavidHeffernan how exactly would their code be _causing_ the timer to tick? If you are implying that it is releasing the main thread to tick, then yes, but it has no access or knowledge of the dispatcher timer – Clark Aug 22 '16 at 13:30
  • 2
    I'd guess that the dispatcher timer is run off the system message queue, and that your unmanaged code is pumping the queue – David Heffernan Aug 22 '16 at 13:56
  • @KingArthur: How are you creating the _locker variable? Is it static? are you making sure it is being created only once? It is usually a good idea to make the locks static readonly to make sure they are created only once. – Milton Hernandez Aug 22 '16 at 14:15
  • 1
    @MiltonHernandez Everything happens on the one thread, so the lock is quite irrelevant – David Heffernan Aug 22 '16 at 14:20
  • @MiltonHernandez - `private static readonly object _locker= new object();` And @DavidHeffernan @HansPassant it is required because there are situations unrelated to this question where another thread can call the function. I wanted to include it attempt to avoid "add a lock" responses – Clark Aug 22 '16 at 14:21

1 Answers1

1

Without a good Minimal, Complete, and Verifiable code example it's impossible to provide a specific and complete diagnosis. But based on the information here, undoubtedly this is exactly as @David says: "your unmanaged code is pumping the queue".

COM in particular is notorious for this, but it can occur in other ways. Typically, the native code enters some kind of wait state in which some or all messages for the thread are still dispatched. This can include the WM_TIMER message for the timer, causing the Tick event to be raised again, even before the previous event handler has returned.

Since it's in the same thread, the lock is irrelevant. The Monitor that is used by lock only blocks threads other than the one holding the lock; the current thread can re-enter any section of code protected by that monitor as often as it wants.

The message in your InvalidOperationException, "This should not be possible", is incorrect. It is possible, and it should be possible. For better or worse, it's how messages in Windows work.

Depending on what your goal is and the specifics of the code involved (which you haven't provided), you have at least a couple of options:

  1. Don't use DispatcherTimer. Instead, use one of the other timer classes, which use the thread pool to raise timer events. These don't rely on a message queue and so pumping messages won't affect how the timer event is raised. Of course, this assumes you don't need to execute the code in the UI thread. Whether this is the case in your situation is not clear from the question. (Actually, it is possible to get this approach to work even if you do need to execute some code in the UI thread while holding the lock, but it gets tricky…better to avoid doing that if you can help it.)

  2. Use the _commandInProgress variable to detect the situation and ignore the timer event if the flag is already set to true. Of course, this assumes you don't need to execute the command on every timer event, and that there's some reasonable way to skip doing so (including dealing with the lack of a result value from the call to the native code). Again, there's not enough information in the question to know if this is the case.

Community
  • 1
  • 1
Peter Duniho
  • 68,759
  • 7
  • 102
  • 136
  • To add missing details, 1. this 3rd party hardware DLL monitors what thread establishes communication with it, and then all future communication **must** occur on that same thread. We are using the UI thread for starters (hence the `DispatcherTimer`), and if we can get it working completely may move everything to a long-running background `Thread()` instance. 2. Not all timer events require command execution, but unfortunately _some_ do. Where can I learn more about "your unmanaged code is pumping the queue" – Clark Aug 23 '16 at 16:58
  • @King: _"monitors what thread establishes communication with it, and then all future communication must occur on that same thread"_ -- that sounds a lot like a STA model COM object. You may find that you will need to explicitly set any alternative thread to STA if you move to using the object in a different thread (this is done for you by Winforms or WPF for the UI thread). – Peter Duniho Aug 23 '16 at 17:30
  • On the bright side, in that case you could implement your timer simply by using `Thread.Sleep()` -- not ideal, but since you don't need the thread to do anything else, shouldn't be that bad. More to the point, it would avoid the re-entrancy. You might find http://stackoverflow.com/questions/21211998/stataskscheduler-and-sta-thread-message-pumping and https://blogs.msdn.microsoft.com/cbrumme/2004/02/02/apartments-and-pumping-in-the-clr/ useful. Additional details can be found on MSDN. – Peter Duniho Aug 23 '16 at 17:30
  • How long does the hardware command take to complete? One obvious potential problem with the current implementation is that if it's required to execute in the UI thread, then you are necessarily blocking the UI thread, at least partially, for the duration. This may be why the DLL is set up to pump messages, but as you've seen that introduces other problems. – Peter Duniho Aug 23 '16 at 17:34
  • Thank you for the help so far & additional info / links. And you are right, but luckily the calls aren't too long (anywhere from 20 - 80ms) so most of the time the UI blocking is not perceivable to the user. My confusion still is with when the main WPF UI thread executes the unmanaged code, how the unmanaged code is able to treat the WPF UI thread differently and start acting on it in ways that either I'm not aware of in .NET or don't exist (for ex, to have it yield back and start acting on other messages) – Clark Aug 23 '16 at 18:07
  • _"to have it yield back and start acting on other messages"_ -- it's important to understand it's not really an issue of "yielding", but of actually finding and dispatching window messages from the thread's message queue. It's similar to how `ShowDialog()` works. E.g. the user clicks a button, your button event handler calls `ShowDialog()`, but even before the handler returns, all the window messages are still dispatched. This is because the call to `ShowDialog()` sets up a new message dispatch loop. The 3rd party DLL is obviously doing something similar... – Peter Duniho Aug 23 '16 at 19:21
  • ... Specifically how I can't say. You'd have to ask the author of the DLL. – Peter Duniho Aug 23 '16 at 19:21
  • I was able to resolve the issue, basically by blocking all click / timer operations until the MFG's DLL returned a response. I probably poorly worded my confusion, but to clarify, my understanding is that a thread cannot leave a function to do other work unless the function is asynchronous. Since the wrapper I provided above is synchronous, I'm unaware of the facilities that make it possible for the thread to execute other code until returning from this function call (_the message queue?_). I will mark this as the answer either way as the information provided was helpful. – Clark Aug 24 '16 at 13:51
  • 1
    @KIng: the re-entrancy is actually very simple: at the top of the thread is a loop that checks the message queue. For each message in the queue, your code is called to handle the message. But you can call other code that itself creates a loop that checks the message queue. That loop can then pull messages from the queue and dispatch them, just as the loop at the top of the thread does. And in doing so, it can call your code again, before returning control to the previous message-handling call. – Peter Duniho Aug 24 '16 at 15:33