-1

I would like some advice on where I can improve or change the design of the current code.

There is a Manager class that is set-up and has a Calculation method that is called by different threads. Each thread has got one resource that it needs to calculate. These resources could belong to different companies. For each company, we want to cache some data, but we can only get the company data via the resource when the Calculate method is called.

So my idea at present, is to have a Dictionary in the Manager class with the companyResourceTag as the Key. When Calculate is called, the companyResourceTag is determined and a method CheckCachedData is called.

private void CheckCachedData(int companyResourceTag)
    {
        _ReaderWriterLock.EnterUpgradeableReadLock();
        try
        {
            if (!_CompanyCachedData.ContainsKey(companyResourceTag))
            {                 
                    Elements elements = ElementService.GetAllElements();
                    DataElements dataElements = ElementService.GetAllDataElements();
                    CalendarGroupings calendarGroupings = CalendarService.GetAllCalendarGroupings();                       

                    CachedDataContainer cachedItem = new CachedDataContainer(elements, dataElements, calendarGroupings);

                    _ReaderWriterLock.EnterWriteLock();
                    try
                    {
                         _CompanyCachedData.Add(companyResourceTag, cachedItem);
                    }
                    finally
                    {
                        _ReaderWriterLock.ExitWriteLock();
                    }
            }
        }
        finally
        {
            _ReaderWriterLock.ExitUpgradeableReadLock();
        }
    }

If there hasn't been a previous resource from this company, then this company's data must be fetched via the Services. The underlying tables do not change often, and we can assume that the tables will remain the same during the time that the calculations are run for all the resources. However, fetching this data is very time-consuming. Hence the need for caching.

There may be say 100 different companies, and 30000+ resources that will have to be calculated. There are a few other places (per resource) where this cached data is read from, for example:

        _ReaderWriterLock.EnterReadLock();
        try
        {
            _CompanyCachedData.TryGetValue(companyResourceTag, out cachedDataContainer);
        }
        finally
        {
            _ReaderWriterLock.ExitReadLock();
        }

       //Do something with cachedDataContainer

I didn't try to make the code more elegant because of Eric Lathrop's comment here: Possible problem

I didn't use a ConcurrentDictionary, because of the issue mentioned here:Possible problem with ConcurrentDictionary

I am not sure whether normal locks would be better than ReaderWriterLockSlim, but I like the idea that there could be more than one reader at a time, and that I can upgrade the lock. At present I am also more concerned about correctness than speed.

Have I used the RWLS correctly? Would you agree on my current use of the UpgradeableReadLock? Would you agree on my choice of not using ConcurrentDictionary?

Community
  • 1
  • 1
Igavshne
  • 699
  • 7
  • 33
  • "If however, I wanted to improve the speed" -- meaning, what? Absent a clearly stated observed performance problem, along with a clearly stated performance goal, that question is way too broad. What evidence do you have that the code you have now isn't already plenty fast? Given that you always have the possibility that you'd have to fetch data over the network, and that this delay is acceptable, it's hard to believe that even a single simple lock would cause any real problem. If RWLS is working, what _specific_ question do you have here? – Peter Duniho Mar 21 '15 at 12:10
  • @Peter Duniho: I have edited the original post so as to be more specific. – Igavshne Mar 23 '15 at 08:45

1 Answers1

2

Absent a good, minimal, complete code example it's impossible to say anything for sure about the code. However, the code you posted seems to be fine in terms of basic correctness.

But…

Would you agree on my current use of the UpgradeableReadLock?

I would normally expect to see retrieval and cache storage to be part of the same operation. I.e. to have a single method that primarily is responsible for retrieving a value, and if not found would then fetch the value the slow way, store that in the cache, and then return.

Splitting the cache retrieval and original fetching into two separate methods seems odd and possibly buggy (but there, without a good code example, I can't say for sure).

(Note: by the above, I simply mean the API that the caller sees. Of course, it makes sense that the actual cache implementation would break the actual data fetching into a separate method. It's just that I would expect to see that method called by the same method used to retrieve values from the cache, so that you only have to take the read lock once in the event that the value is in fact already in the cache).

Would you agree on my choice of not using ConcurrentDictionary?

Again, lacking more context it's hard to say. I tend to try to avoid ConcurrentDictionary because it has somewhat harder-to-use semantics as compared to the conventional Dictionary<TKey, TValue> class, at least when taking advantage of its concurrency-specific features.

But in your case, I would think that those concurrency-specific features could be what you're looking for. In particular, the TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory) method implements the semantics a cache normally would require: retrieve an existing value, or populate the dictionary with a new value if the key is not present.

On the other hand, there's an important caveat with that method: the valueFactory delegate could be invoked multiple times if the method is called concurrently from multiple threads. Only one of the results will actually be used to populate the dictionary, and it may or may not have a local performance impact, but it obviously could affect the overall load on the server from which you are fetching the data.

If that's a concern, then probably the ReaderWriterLockSlim is a better choice, as it allows you to ensure there's only one writer.

On the third hand, I'll note my previous comment regarding the relative performance here. That is, it's not clear that the potential performance advantage of ReaderWriterLockSlim is actually important here. In theory, there should be less overhead in the already-populated case as compared to using the Monitor class (i.e. lock statement), but I would think you'd have to be dealing with a very high level of contention for that to actually play out.

It's not even clear from your example how companyResourceTag maps to a given group of data for the cache key, never mind how frequently you expect to see code retrieving already-fetched data from the cache (given the assumed cost of fetching the data, I presume the synchronization cost of a cache miss isn't of concern). So there's a lot of important detail missing from the question that would be needed for a precise answer.

I guess if there's a very high level of contention, and misses are very rare, ReaderWriterLockSlim might give you a measurable performance improvement. But otherwise, I would expect lock to offer simpler code with perfectly acceptable performance.

Granted, that's all pure speculation. Again, without a complete code example it's not really possible to say for sure what synchronization mechanism would be best in your scenario. But given the assumed cost of fetching the data, it does seem as though managing your own synchronization may be better than using ConcurrentDictionary<TKey, TValue>, given that the latter doesn't provide a direct way to avoid fetching the data more than once.

Whether you should be using ReaderWriterLockSlim instead of the simpler lock approach, is even more speculative. It's possible the former is better, but absent more information it's just as likely it's not.

Community
  • 1
  • 1
Peter Duniho
  • 68,759
  • 7
  • 102
  • 136
  • surely the third hand is the other other hand, or is that the first hand again? :-) – Jodrell Mar 23 '15 at 14:43
  • @Peter Duniho, Thanks your answer has been useful so far. I will soon edit the code to clarify it more. However, as I understand it, there can only be one thread that holds the UpgradeableReadLock (although it allows other threads to finish if they were holding a Readlock), and therefore I thought that the point you made under the third hand is not possible. [MSDN link](https://msdn.microsoft.com/en-us/library/system.threading.readerwriterlockslim%28v=vs.110%29.aspx) – Igavshne Mar 24 '15 at 07:30
  • @Mikrur: sorry, yes you are correct. I should have double-checked my memory before writing that. I've edited the question to remove that erroneous comment, and to clarify my overall (and admittedly vague) thoughts and conclusions based on the (admittedly vague) problem statement. – Peter Duniho Mar 24 '15 at 14:43
  • @Peter Duniho: My apologies for the vague question. I can not provide a minimal, complete and verifiable program as the link explains in any reasonable amount of time at present. CompanyResourceTag is used in methods to retrieve data (such as those for Elements, DataElements and CalendarGroupings) which I didn't include in the code block. I can understand that you have been frustrated by this, but thank you very much for the time you have taken in giving your speculative answers. – Igavshne Mar 31 '15 at 15:48