4

I'm working on an old and large WPF application. The customer reported a bug, which they were able to reproduce, but I can't. There is a class in the application that looks like this:

public static class PermissionProvider
{
    private static Dictionary<string, bool> Permissions;

    public static void Init()
    {
        Permissions = new Dictionary<string, bool>();
    }
    
    private static object _lock = new object();
    public static bool HasPermission(string permission)
    {
        if (string.IsNullOrEmpty(permission)) return false;

        lock (_lock)
        {
            if (Permissions.ContainsKey(permission)) return Permissions[permission];
            var hasPermission = true; // Expensive call a third party module to check user permissions.
            Permissions.Add(permission, hasPermission);
            return hasPermission;
        }
    }
}

According to the log files provided by the customer, the line Permissions.Add(permission, hasPermission) threw an ArgumentException (key already exists). This doesn't make sense to me; the code checks for the key inside the same lock.

Based on a test run, all calls to HasPermission seem to be made from the main thread. The program uses Dispatcher.BeginInvoke at places, but my understanding is that locking is not even necessary for that. The dictionary is private and not accessed from anywhere else.

In what situation could this exception happen?


My first thought was that the customer was running an old version of the application, but it turns out that this class was only added in the latest one.

This particular exception should be easy enough to avoid by just changing the Permissions.Add(permission, hasPermission) to Permissions[permission] = hasPermission, but I would prefer to understand why it happened first.

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
jkiiski
  • 8,206
  • 2
  • 28
  • 44
  • 2
    Is this the whole code of the `PermissionProvider` class? – Theodor Zoulias Jun 22 '21 at 09:02
  • 1
    The only thing missing is the call to the third party. – jkiiski Jun 22 '21 at 09:04
  • 2
    Is there any possibility for `Init` to be called more than once? You might consider replacing it with a static constructor. – JonasH Jun 22 '21 at 09:14
  • 3
    As a side note, the `PermissionProvider` class, which looks like a cache for permissions, is probably quite inefficient. If a thread asks for the permission `"Gazelle"` which is not cached, an expensive check will be invoked while holding the lock. Then a second thread that will ask for the permission `"Rhino"`, which presumably is already cached, will have to wait until the check for the `"Gazelle"` is finished. – Theodor Zoulias Jun 22 '21 at 09:14
  • @JonasH How new `Permissions` object can impact the case _"key already exists"_. If the error would be "key does not exist", then yes. – Rekshino Jun 22 '21 at 09:21
  • _"According to the log files provided by the customer"_ How it being logged? – Rekshino Jun 22 '21 at 09:24
  • @Rekshino The exception is caught higher in the stack and logged via Log4net. – jkiiski Jun 22 '21 at 09:31
  • @TheodorZoulias True. This could certainly be improved. It seems to have been added as a quick solution to the customer complaining about slowness (previously the third party was just called directly). – jkiiski Jun 22 '21 at 09:33
  • An easy way to improve this would be to switch from the `Dictionary` to a `Dictionary>`, and do the expensive check inside the callback of the [`Lazy`](https://learn.microsoft.com/en-us/dotnet/api/system.lazy-1). – Theodor Zoulias Jun 22 '21 at 10:04

3 Answers3

4

It is possible, but hard to tell without the whole source code.

The expensive call

var hasPermission = true; // Expensive call a third party module to check user permissions.

could do something that calls HasPermission() again. Thus, the same thread would enter

lock (_lock) { ... }

again (which is allowed), possibly adding the the permissing, leaving the lock, leaving the method and returning into HasPermission() where it came from, adding the same key again.

This might either require production debugging at your customer. If you're not familiar with that and you can convince your customer to replace the affected DLL for a moment (let him create a backup copy), you could try the following:

lock (_lock) 
{
    var stack = Environment.StackTrace;
    if (stack.Split(new []{nameof(HasPermission)}, StringSplitOptions.None).Length> 2) throw new Exception("I should not be in here twice");
    ...
}

This should crash the application (unless general catch block somewhere) with a call stack that has the affected method twice, thus you can analyze where the second call comes from. Do whatever you would do in such a case: generate a crash dump, analyze your logs, ...

Generating a stack trace is considerably expensive, so this will change timing and thus potentially make the problem disappear. A disappeared problem is not a fixed problem, though.

Thomas Weller
  • 55,411
  • 20
  • 125
  • 222
  • The expensive call is to a third party module that doesn't call back to our application code. – jkiiski Jun 22 '21 at 08:52
  • 2
    @jkiiski are you sure there no way another call could happen on the same thread? This might happen in unexpected ways. If the UI thread is involved, the third party code could enter some kind of blocking state that still pumps messages, potentially allowing arbitrary code to run on the UI thread. – JonasH Jun 22 '21 at 09:11
  • 1
    @JonasH I'll have to investigate that possibility. We don't have the source code for the third party code, but I suppose I can try to check the call stacks for every call to `HasPermission` to see where they are coming from. – jkiiski Jun 22 '21 at 09:29
  • @JonasH _"the third party code could enter some kind of blocking state that still pumps messages, potentially allowing arbitrary code to run on the UI thread"_ how do you imagine it? `await` is not allowed in `lock`, so the software will freeze on blocking it in one event handler and another event handler will not be called on GUI dispatcher. – Rekshino Jun 22 '21 at 10:07
  • There do seem to be calls to `HasPermission` coming through `CanExecute` methods on commands, triggered by `PropertyChanged` events. Perhaps the third party code somehow ends up triggering those (although I'm not sure how that would happen). – jkiiski Jun 22 '21 at 10:18
  • 1
    @Rekshino The simplest way to start the message pump might be to call `.ShowDialog` on a control, or call `Application.DoEvents`. But there are various rules about what messages are pumped in what situations, and I do now know all the details. But it is prudent to make as few assumption as possible about third party code. – JonasH Jun 22 '21 at 10:34
  • @JonasH: Totally agree for DoEvents(). It was quite common for programs migrated from VB6. I had hoped it disappeared in the meanwhile :-) – Thomas Weller Jun 22 '21 at 11:36
  • It'll take more work to figure out how exactly this problem happens, and how best to fix it, but this answer is almost certainly correct that it's caused by the third party code resulting in calling the same method again, so I'll mark it accepted. – jkiiski Jun 22 '21 at 12:06
  • 2
    @JonasH @ThomasWeller Just for your interest, there is no `DoEvents` method in the `Application` type in WPF. See _[Where is the Application.DoEvents() in WPF?](https://stackoverflow.com/q/4502037/6181599)_ for reference. – thatguy Jun 22 '21 at 12:56
2

I agree with Thomas Weller that the most likely reason is that the same thread reenter the lock for some reason. But i wanted to suggest another approach to these kinds of problems.

Holding a lock while calling arbitrary code can be dangerous, it may lead to deadlocks and various other issues. To limit such risks it is a good idea to only hold locks for short sections of code, only call code you know is safe, and does not raise events or can run arbitrary code some other way.

One option would be to switch to a 'publication only' model for thread safety that releases the lock while calling the 'expensive method'. This might allow multiple threads to call the expensive method at the same time, and this might or might not be an issue in your particular case. Something like:

lock (_lock)
{
     if (Permissions.ContainsKey(permission)) return Permissions[permission];
}
var hasPermission = true; // Expensive call a third party module to check user permissions.
lock (_lock)
{
     if (Permissions.ContainsKey(permission)) return Permissions[permission];
     Permissions.Add(permission, hasPermission);
     return hasPermission;
}

Or use ConcurrentDictionary.GetOrAdd that does more or less the same thing.

I would also caution against mutable global state in general since this can also make code hard to read and predict.

JonasH
  • 28,608
  • 2
  • 10
  • 23
1

As pointed out by JonasH in a comment, the Init method looks highly suspicious. Your program could crash if this method is not called exactly once. If you are not sure how many times it's called, at least protect the code it contains with the same lock.

public static void Init()
{
    lock (_lock)
    {
        Permissions = new Dictionary<string, bool>();
    }
}
Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
  • 1
    The `Init` method should only be called once in the application startup, but I'll add this to be sure. – jkiiski Jun 22 '21 at 10:14