I'm writing something like a file cache and I'm debating between using lock and ConcurrentDictionary. If multiple threads ask for the same key, an ordinary Dictionary will have issues when two threads try to write to it, so I tried ConcurrentDictionary. That raises a secondary problem: how do you prevent the file from being read twice (or more) when each thread tries to get it? I've added sample code to explain what I mean.
Here's a version using lock and Dictionary:
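    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Threading.Tasks;

    class Program
    {
        private static readonly object locking = new object();
        private static Dictionary<string, byte[]> cache;

        static void Main(string[] args)
        {
            cache = new Dictionary<string, byte[]>();
            // Two tasks ask for the same key at roughly the same time.
            Task first = Task.Run(() =>
            {
                AddToCache("largefile", "largefile.bin");
            });
            Task second = Task.Run(() =>
            {
                AddToCache("largefile", "largefile.bin");
            });
            // Wait for both, otherwise Main can return before either task runs.
            Task.WaitAll(first, second);
        }

        static byte[] AddToCache(string key, string filename)
        {
            // The lock covers both the lookup and the file read,
            // so the second caller blocks until the first has finished.
            lock (locking)
            {
                if (cache.TryGetValue(key, out byte[] data))
                {
                    Console.WriteLine("Found in cache");
                    return data;
                }
                Console.WriteLine("Reading file into cache");
                data = File.ReadAllBytes(filename);
                cache[key] = data;
                return data;
            }
        }
    }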
This version does what's expected: it protects the dictionary against multiple threads and reads the large file only ONCE.
Here's the second version using ConcurrentDictionary:
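    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading.Tasks;

    class Program
    {
        private static ConcurrentDictionary<string, byte[]> cache;

        static void Main(string[] args)
        {
            cache = new ConcurrentDictionary<string, byte[]>();
            // Two tasks ask for the same key at roughly the same time.
            Task first = Task.Run(() =>
            {
                AddToCache("largefile", "largefile.bin");
            });
            Task second = Task.Run(() =>
            {
                AddToCache("largefile", "largefile.bin");
            });
            // Wait for both, otherwise Main can return before either task runs.
            Task.WaitAll(first, second);
        }

        static byte[] AddToCache(string key, string filename)
        {
            // The value factory runs whenever the key is not yet present.
            return cache.GetOrAdd(key, (s) =>
            {
                Console.WriteLine("Reading file into cache");
                return File.ReadAllBytes(filename);
            });
        }
    }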
This version protects the dictionary, BUT it reads the large file TWICE, which isn't what's required. I think I'm doing something wrong here, but not being familiar with GetOrAdd, I'm not sure what.
The first version looks fine, but it's a cut-down version of the real code, where the lock would end up covering a lot of code. The second version looks much simpler but doesn't stop the file from being read multiple times. Is there a way to do this without the lock blocking a lot of code, or is that the only answer?
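For reference, the double read in the second version seems consistent with how GetOrAdd is documented: the value factory runs outside the dictionary's internal locks, so two threads that both miss the key can each invoke it, and only one result is kept. One pattern I've come across is to store Lazy<byte[]> instead of byte[], so the factory only builds a cheap wrapper and the file is read when Value is first accessed. Below is a minimal sketch of that idea, not a definitive implementation. It assumes the same key and filename as above; switching the value type to Lazy<byte[]> and making AddToCache public are my own changes for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // Assumption: values are wrapped in Lazy<byte[]> rather than stored directly.
    private static readonly ConcurrentDictionary<string, Lazy<byte[]>> cache =
        new ConcurrentDictionary<string, Lazy<byte[]>>();

    static void Main(string[] args)
    {
        Task first = Task.Run(() =>
        {
            AddToCache("largefile", "largefile.bin");
        });
        Task second = Task.Run(() =>
        {
            AddToCache("largefile", "largefile.bin");
        });
        Task.WaitAll(first, second);
    }

    public static byte[] AddToCache(string key, string filename)
    {
        // GetOrAdd may still call this factory more than once under contention,
        // but it only constructs a Lazy wrapper; no file I/O happens here.
        Lazy<byte[]> entry = cache.GetOrAdd(key, _ => new Lazy<byte[]>(() =>
        {
            Console.WriteLine("Reading file into cache");
            return File.ReadAllBytes(filename);
        }, LazyThreadSafetyMode.ExecutionAndPublication));

        // Every caller receives the one Lazy that was stored, and
        // ExecutionAndPublication lets only one thread run its initializer.
        return entry.Value;
    }
}
```

With this shape, only threads asking for the same key wait on each other, and only while that key's file is being read. One caveat I'm aware of: in ExecutionAndPublication mode, an exception thrown by the initializer is cached by the Lazy instance, so a failed read would keep failing for that key until the entry is removed.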