Preamble: this is a self-assigned, purely synthetic task for learning (and refreshing what I already knew about) C# threads, synchronization, and data structures.

The story:

Let's say I have a dictionary <string, string> that maps a key to the (HTTP) path of a file, e.g.:

foo => http://domain.tld/file1
bar => http://domain2.tld/file2

And I'd like to write a class that implements an interface with 2 methods:

String Rand();
String Get(String key);

The first method would pick a file at random from all those available, and Get would return a particular file, or to be precise, the local path to the downloaded file.

The class should be thread-safe: if several threads request the same key with Get(), or Rand() picks the same item, then only one thread should actually download the file to the local drive, and if the file has already been downloaded its path should be returned immediately.

So, that's where I'm stuck.

How would I synchronize a "downloader" so that the same file is never downloaded twice?

How could I limit the number of simultaneous downloads?

PS: I'm not asking for any code, just keywords for the data structures, classes and patterns that would be useful for this task.

PPS: the task is 100% abstract, so if you think some changes to the requirements would make it more interesting or useful for me (as a learner), feel free to suggest them.

zerkms
  • @I4V: the MSDN links actually answer everything I needed, thanks for that, except synchronising threads by key – zerkms May 09 '13 at 10:05
  • @I4V: but how would I check, in a thread-safe manner, whether a file is already present locally or is being downloaded? – zerkms May 09 '13 at 10:27
  • If you're in for the learning, why not learn the *right* way to do it? No process likes blocked threads in waiting. Use [await async](http://msdn.microsoft.com/en-us/library/vstudio/hh191443.aspx). – Remus Rusanu May 09 '13 at 12:44
  • @Remus Rusanu: yep, `await` and `async` is the second stage. For a proof of concept I've implemented http://pastebin.com/LJnK7FB6 – zerkms May 09 '13 at 12:46
  • @Remus Rusanu: http://pastebin.com/LdFvPDbQ --- ta-dah ;-) The same with async/await. What do you think of it? – zerkms May 14 '13 at 10:37
  • @Jaroslaw Waliszko: here I implemented the same thread-safe (I hope so) solution with await/async: http://pastebin.com/LdFvPDbQ – zerkms May 14 '13 at 10:38
  • `map ?? new Dictionary();` shouldn't this be `new ConcurrentDictionary<...>`? – Remus Rusanu May 14 '13 at 10:42
  • Also the main `var map = new Dictionary()` – Remus Rusanu May 14 '13 at 10:43
  • @Remus Rusanu: nope, they aren't modified concurrently – zerkms May 14 '13 at 10:43
  • yes, you're right. But `_storage` is. – Remus Rusanu May 14 '13 at 10:46
  • @Remus Rusanu: got it, it should be indeed – zerkms May 14 '13 at 10:52
  • Damn, this thing used to be *hard*. With async/await it's just a natural flow :) But you skipped over the real meat and potatoes. [`WebRequest.GetResponseAsync()`](http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponseasync.aspx), [`Stream.WriteAsync()`](http://msdn.microsoft.com/en-us/library/system.io.stream.writeasync.aspx) etc – Remus Rusanu May 14 '13 at 10:58
  • @Remus Rusanu: Yep, the work (thanks to `await`) will be incorporated naturally by replacing that `Task.Delay` :-) (if only we don't handle errors) – zerkms May 14 '13 at 11:01
  • @Remus Rusanu: finally I have found that there is a race condition in that code :-S So couldn't find anything better than using `lock` :-( http://pastebin.com/9TEtidNz – zerkms May 21 '13 at 09:06
  • [`TryAdd`](http://msdn.microsoft.com/en-us/library/dd267291.aspx). Only the caller whose add succeeds has to submit the Task. – Remus Rusanu May 21 '13 at 10:48
  • @Remus Rusanu: `TryAdd` would be used **after** you've run another file download task. So there is still a race – zerkms May 21 '13 at 11:04

1 Answer

So the "final" version of a "downloader" class that satisfies the requirements and uses await/async is:

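using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class Downloader
{
    // key -> remote path; only read after construction, so a plain dictionary is enough
    private IDictionary<string, string> _map;
    // key -> local path of a finished download
    private IDictionary<string, string> _storage = new ConcurrentDictionary<string, string>();
    // key -> download task that has been started for that key
    private ConcurrentDictionary<string, Task<string>> _progress = new ConcurrentDictionary<string, Task<string>>();

    public Downloader(IDictionary<string, string> map)
    {
        _map = map ?? new Dictionary<string, string>();
    }

    public async Task<string> Get(string key)
    {
        string path;

        if (!_map.TryGetValue(key, out path))
        {
            throw new ArgumentException("The specified key wasn't found");
        }

        // Already downloaded: return the cached local path.
        string local;
        if (_storage.TryGetValue(key, out local))
        {
            return local;
        }

        // A download for this key was already started: wait for it.
        Task<string> task;
        if (_progress.TryGetValue(key, out task))
        {
            return await task;
        }

        // Note: the download starts before TryAdd, so two callers that both
        // missed _progress can each start one. The caller that loses TryAdd
        // retries and awaits the winner's task; its own download still runs,
        // but the result is discarded.
        task = _retrieveFile(path);

        if (!_progress.TryAdd(key, task))
        {
            return await Get(key);
        }

        local = await task;
        _storage[key] = local;
        return local;
    }

    // Stand-in for the real download: waits 3 seconds and returns a fake local path.
    private async Task<string> _retrieveFile(string path)
    {
        Console.WriteLine("Started retrieving {0}", path);
        await Task.Delay(3000);
        Console.WriteLine("Finished retrieving {0}", path);
        return path + " local path";
    }
}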

The whole code with example output: http://pastebin.com/LdFvPDbQ
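
In the class above, `_retrieveFile` runs before `TryAdd`, so two callers that both miss `_progress` can each begin a download, which is the race raised in the comments on the question. Below is a minimal sketch of one possible way to close that window and also cap the number of simultaneous downloads. It is not the code from the pastebin links. It assumes that storing a `Lazy<Task<string>>` per key via `ConcurrentDictionary.GetOrAdd` is acceptable, and it uses a `SemaphoreSlim` as the limiter. The names `ThrottledDownloader`, `maxConcurrent`, `Started` and `RetrieveFile` are hypothetical, and `Task.Delay` is still only a placeholder for the real download.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical variant of Downloader; a sketch, not the original implementation.
class ThrottledDownloader
{
    private readonly IDictionary<string, string> _map;

    // Limits how many downloads may run at the same time.
    private readonly SemaphoreSlim _slots;

    // One lazily started download per key. Finished tasks stay in the
    // dictionary, so it also serves as the cache of local paths.
    private readonly ConcurrentDictionary<string, Lazy<Task<string>>> _downloads =
        new ConcurrentDictionary<string, Lazy<Task<string>>>();

    // Instrumentation only: counts how many downloads were really started.
    private int _started;

    public ThrottledDownloader(IDictionary<string, string> map, int maxConcurrent)
    {
        _map = map ?? new Dictionary<string, string>();
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public int Started { get { return Volatile.Read(ref _started); } }

    public Task<string> Get(string key)
    {
        string url;
        if (!_map.TryGetValue(key, out url))
        {
            throw new ArgumentException("The specified key wasn't found", "key");
        }

        // GetOrAdd may call the factory more than once under contention, but
        // the factory only creates a cheap Lazy wrapper. Every caller receives
        // the single stored Lazy, and Lazy<T> (thread-safe by default) runs
        // RetrieveFile at most once.
        var lazy = _downloads.GetOrAdd(
            key,
            _ => new Lazy<Task<string>>(() => RetrieveFile(url)));

        return lazy.Value;
    }

    private async Task<string> RetrieveFile(string url)
    {
        // Wait for a free slot without blocking a thread.
        await _slots.WaitAsync();
        try
        {
            Interlocked.Increment(ref _started);
            await Task.Delay(100); // placeholder for the real download
            return url + " local path";
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

With this shape, every caller asking for the same key gets the same task back, whether the download is pending or already finished. One trade-off to be aware of: a failed download stays cached as a faulted task, so retrying would need the entry to be removed from `_downloads`. A `Rand()` method could be layered on top by picking a random key from `_map` and calling `Get`.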

zerkms