I am writing a web service that generates, caches and serves zip files.
If a requested file doesn't exist in the cache, it is generated and then served. Depending on the request, generating the file can take quite some time, so another request for the same zip file can come in while the first request is still generating it.
A basic scenario might go like this:
- thread 1: Give me bigfile.zip
- thread 1: bigfile.zip doesn't exist
- thread 1: Generating bigfile.zip
- thread 2: Give me bigfile.zip
- thread 2: Thread 1 is generating bigfile.zip - wait for it to finish
- thread 1: Finished generating bigfile.zip
- thread 1: Serving bigfile.zip
- thread 2: Serving bigfile.zip
So I am considering using a Thread to achieve this, and using Join() to synchronise the requests once the file is ready.
But here I have a problem. How would I go about managing several requests for several different files? I was thinking of using a Dictionary<fileId, Thread> to keep track of them, but then how could I safely remove a thread from the dictionary when it has finished its process? I can't see any way of doing it without putting a lock around the whole thing - including the actual process itself. Of course, doing that would seem to make the whole idea of threading redundant in the first place.
lock(_myLocker)
{
    if(!fileThreads.ContainsKey(fileId))
    {
        Thread myThread = MakeMeAThread();
        fileThreads.Add(fileId, myThread);
    }
    //We have to do the Join inside the lock, this is the only way we know (in a threadsafe manner) that the dictionary definitely contains our key
    fileThreads[fileId].Join();
}
ServeTheFile();
//How do I clean up the no longer required fileThreads[fileId]?
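For comparison, here is a minimal sketch of how the lock might be narrowed so that it only protects the dictionary, with the Join() done on a local reference outside it. The class name FileGenerationTracker and the isCached and generate delegates are hypothetical stand-ins for the real cache check and zip generation, and the sketch assumes the file becomes visible in the cache before the worker thread ends. Exception handling in the worker is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class FileGenerationTracker
{
    private readonly object _locker = new object();
    private readonly Dictionary<string, Thread> _fileThreads = new Dictionary<string, Thread>();
    private readonly Func<string, bool> _isCached; // hypothetical: true once the zip is in the cache
    private readonly Action<string> _generate;     // hypothetical: builds the zip for a file id

    public FileGenerationTracker(Func<string, bool> isCached, Action<string> generate)
    {
        _isCached = isCached;
        _generate = generate;
    }

    // Blocks until the file for fileId is available in the cache.
    public void WaitForFile(string fileId)
    {
        Thread worker;
        lock (_locker)
        {
            // The lock covers only the cache check and the dictionary lookup/insert.
            if (_isCached(fileId))
            {
                return;
            }
            if (!_fileThreads.TryGetValue(fileId, out worker))
            {
                worker = new Thread(() => _generate(fileId));
                _fileThreads.Add(fileId, worker);
                worker.Start(); // started before the lock is released, so Join is always valid
            }
        }

        // Join on the local reference, outside the lock, so requests
        // for other files are not blocked while this one is generated.
        worker.Join();

        lock (_locker)
        {
            // Every waiter tries the cleanup; only the entry for this worker is removed.
            if (_fileThreads.TryGetValue(fileId, out Thread current) && ReferenceEquals(current, worker))
            {
                _fileThreads.Remove(fileId);
            }
        }
    }

    // True while a worker thread for fileId is still registered.
    public bool IsGenerating(string fileId)
    {
        lock (_locker)
        {
            return _fileThreads.ContainsKey(fileId);
        }
    }
}
```

Because each waiter holds its own reference to the Thread, it no longer matters whether the key is still in the dictionary by the time Join() returns, and the cleanup can safely be attempted by every waiter.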
To add to the difficulty, there is another way of consuming the service that simply tells the client the status of the file being requested: unavailable (404), being generated, or ready.
- thread 1: Give me bigfile.zip
- thread 1: bigfile.zip doesn't exist
- thread 1: Generating bigfile.zip
- thread 2: Give me bigfile.zip
- thread 2: Thread 1 is generating bigfile.zip - wait for it to finish
- thread 3: Do you have bigfile.zip? - No, it's being generated
- thread 1: Finished generating bigfile.zip
- thread 1: Serving bigfile.zip
- thread 2: Serving bigfile.zip
- thread 4: Do you have bigfile.zip? Yes, it's ready for you
- thread 5: Do you have invalid.zip? No, that's an invalid request
So, can you see why we can't just put a lock around the process? If we did, Thread 3 couldn't be told that the file is being generated and would have to wait for the file generation to finish.
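To make the status requirement concrete, here is a rough sketch that swaps the Thread for a Task and the locked Dictionary for a ConcurrentDictionary of Lazy<Task>, so a status query never waits on generation. This is a different technique from the Thread/Join() approach above, not a drop-in fix for it. ZipCache, FileStatus, and the isCached and generate delegates are hypothetical names, and Unavailable stands for both invalid and not-yet-requested files.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum FileStatus { Unavailable, BeingGenerated, Ready }

public class ZipCache
{
    // One entry per file that is currently being generated.
    private readonly ConcurrentDictionary<string, Lazy<Task>> _inFlight =
        new ConcurrentDictionary<string, Lazy<Task>>();
    private readonly Func<string, bool> _isCached; // hypothetical: true once the zip is in the cache
    private readonly Action<string> _generate;     // hypothetical: builds the zip for a file id

    public ZipCache(Func<string, bool> isCached, Action<string> generate)
    {
        _isCached = isCached;
        _generate = generate;
    }

    // Returns a task that completes once the file is in the cache.
    // Concurrent callers for the same fileId share one generation task.
    public Task EnsureGeneratedAsync(string fileId)
    {
        if (_isCached(fileId))
        {
            return Task.CompletedTask;
        }

        // GetOrAdd may build several Lazy wrappers under contention, but only the
        // stored one has its Value read, so generation is started at most once.
        Lazy<Task> lazy = _inFlight.GetOrAdd(fileId, id => new Lazy<Task>(() => Task.Run(() =>
        {
            try
            {
                // Re-check in case an earlier task finished after the caller's cache check.
                if (!_isCached(id))
                {
                    _generate(id);
                }
            }
            finally
            {
                // The task removes its own entry, so no caller has to clean up.
                _inFlight.TryRemove(id, out _);
            }
        })));
        return lazy.Value;
    }

    // Answers immediately; never waits for a generation in progress.
    public FileStatus GetStatus(string fileId)
    {
        if (_isCached(fileId))
        {
            return FileStatus.Ready;
        }
        return _inFlight.ContainsKey(fileId) ? FileStatus.BeingGenerated : FileStatus.Unavailable;
    }
}
```

In the scenario above, threads 1 and 2 would both wait on the task returned by EnsureGeneratedAsync("bigfile.zip"), while threads 3, 4 and 5 would call GetStatus and return straight away.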