
How the search results caching works

When a user enters a search query:

  • The query is split into an array of tokens.
  • A unique hash is created from the token array (the tokens are ordered alphabetically, then MD5-hashed). This hash is the unique identity of the search.
  • The cache is checked for results under that hash.
  • If no cached entry exists, the results are computed and saved to the cache under that hash.
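The hashing step above can be sketched as follows. This is only an illustration of the "order alphabetically then MD5" rule; the tokenizer shown is a hypothetical whitespace splitter, and the real `Search.Functions` implementation may differ:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class SearchHashing
{
    // Hypothetical tokenizer; the real Search.Functions.TokeniseSearchQuery may differ.
    public static string[] Tokenise(string query) =>
        query.ToLowerInvariant()
             .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Order the tokens alphabetically, join them, then MD5 the result,
    // so "red car" and "car red" produce the same cache key.
    public static string GetUniqueHash(string[] tokens)
    {
        var canonical = string.Join(" ", tokens.OrderBy(t => t, StringComparer.Ordinal));
        using (var md5 = MD5.Create())
        {
            var bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(canonical));
            return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
        }
    }
}
```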

Problem I'm trying to solve

If a user performs a search that takes, say, 10 seconds and then impatiently refreshes the page, we don't want the query to start again. That search should be locked.

However, if an expensive query is running, we don't want to lock out other users performing less expensive searches.

To solve this, I need multiple locks.

Implementation

This is how I've got it currently implemented:

private static readonly object MasterManualSearchLock = new object();
private static readonly Dictionary<string, object> ManualSearchLocks = new Dictionary<string, object>();

/// <summary>
/// Search the manual
/// </summary>
public static SearchResponse DoSearch(string query, Manual forManual)
{
    var tokens = Search.Functions.TokeniseSearchQuery(query);
    var tokenHash = Search.Functions.GetUniqueHashOfTokens(tokens);
    var cacheIndex = Settings.CachePrefix + "SavedManualSearch_" + tokenHash;
    var context = HttpContext.Current;

    if (context.Cache[cacheIndex] == null)
    {
        // Create lock if it doesn't exist
        if (!ManualSearchLocks.ContainsKey(tokenHash))
        {
            lock (MasterManualSearchLock)
            {
                if (!ManualSearchLocks.ContainsKey(tokenHash))
                {
                    ManualSearchLocks.Add(tokenHash, new object());
                }
            }

        }

        lock (ManualSearchLocks[tokenHash])
        {
            if (context.Cache[cacheIndex] == null)
            {
                var searchResponse = new SearchResponse(tokens, forManual, query);
                context.Cache.Add(cacheIndex, searchResponse, null, DateTime.Now.AddMinutes(Settings.Search.SearchResultsAbsoluteTimeoutMins), Cache.NoSlidingExpiration, CacheItemPriority.BelowNormal, null);
            }
            ManualSearchLocks.Remove(tokenHash);
        }

    }
    return (SearchResponse)context.Cache[cacheIndex];
} 

Questions

  • Is this a sensible implementation?
  • Is this thread safe?
  • Is including the removal of the lock within the lock itself OK?
Tom Gullen
  • There is a small window where `lock (ManualSearchLocks[tokenHash])` can fail because another thread just executed `ManualSearchLocks.Remove(tokenHash);` for the same hash (after the first thread "guaranteed" it is in the dictionary). – Eric J. Sep 03 '15 at 17:42
  • @EricJ. Should the removal of the lock be locked by the master lock? and also, should I be using a different data type for the dictionary of locks, perhaps `ConcurrentDictionary`? – Tom Gullen Sep 03 '15 at 17:43
  • That doesn't change anything because in my scenario you already released your current `lock (MasterManualSearchLock)`, yet if you wrap the entire piece of code in that same lock you will effectively serialize your code. – Eric J. Sep 03 '15 at 17:45
  • @EricJ is it possible to make this function thread safe, or shall I just accept this small window? It feels to me like this is particularly unlikely to occur. – Tom Gullen Sep 03 '15 at 17:48
  • `System.Runtime.Caching.MemoryCache` – Eser Sep 03 '15 at 17:56
  • Check out @Eser's suggestion. MemoryCache already handles the problem of multiple threads accessing the cache, provides flexible cache invalidation, etc. – Eric J. Sep 03 '15 at 18:00
  • See this answer http://stackoverflow.com/a/26724549/2707705 – Biscuits Sep 03 '15 at 18:13

1 Answer


Your concurrent use of `ManualSearchLocks` is unsafe; `ConcurrentDictionary` is a good replacement. And no, even just reading from the dictionary while other threads write to it is not safe, because `Dictionary<TKey, TValue>` is not documented to be thread-safe for that.

I'd put a `Lazy<T>` into the cache. Multiple such lazy instances might be produced, but only one will be materialized: all threads wanting the results for a particular key call `Lazy<T>.Value` and synchronize automatically, so only one actual "search" is executed.

Depending on how you access the cache, there might still be a small race window that allows multiple lazy instances to execute. That is probably not a big deal in your case.
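A sketch of that pattern, using `ConcurrentDictionary` as the cache for simplicity (the names `GetOrSearch` and `runSearch` are illustrative, and expiration is not handled here; in your setup the `Lazy<SearchResponse>` would instead go into `HttpContext.Cache` or `MemoryCache.AddOrGetExisting` with the same absolute timeout):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public class SearchResponse { /* elided */ }

public static class SearchCache
{
    // Lazily-evaluated results, keyed by token hash.
    private static readonly ConcurrentDictionary<string, Lazy<SearchResponse>> Cache =
        new ConcurrentDictionary<string, Lazy<SearchResponse>>();

    public static SearchResponse GetOrSearch(string tokenHash, Func<SearchResponse> runSearch)
    {
        // GetOrAdd may invoke the factory more than once under a race, but only
        // the Lazy that wins is stored, and with ExecutionAndPublication its
        // .Value runs runSearch exactly once. All threads asking for the same
        // hash block on that one search; different hashes never block each other.
        var lazy = Cache.GetOrAdd(tokenHash,
            _ => new Lazy<SearchResponse>(runSearch,
                LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value;
    }
}
```

This removes the need for the master lock, the per-hash lock dictionary, and the problematic `Remove` call entirely.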

usr