
I'm scanning some directory for items. I've just read the Multithreaded Directory Looping in C# question, but I still want to make it multithreaded. Even though everyone says the drive will be the bottleneck, I have some points:

  • The drives may mostly be "single threaded", but how do you know what they are going to bring up in the future?
  • How do you know that the different sub-paths you are scanning are on the same physical drive?
  • I am using an abstraction layer (even two) over System.IO so that I can later reuse the code in different scenarios.

So my first idea was to use Task, and my first dummy implementation was this:

public async Task Scan(bool recursive = false) {
    var t = new Task(() => {
        foreach (var p in path.scan) Add(p);
        if (!recursive) return;
        // one task per sub-node, then block until they all finish
        var tks = new Task[subs.Count];
        var i = 0;
        foreach (var s in subs) tks[i++] = s.Scan(true);
        Task.WaitAll(tks);
    });
    t.Start();
    await t;
}

I don't like the idea of creating a Task for each item, and generally this doesn't seem ideal, but this was just a test, as Tasks are advertised to manage the threads automatically...
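
For reference, `new Task(...)` followed by `Start()` and `Task.Run(...)` both queue the delegate to the ThreadPool; `Task.Run` is just the idiomatic single-call form. A minimal, self-contained sketch (not the project's code):

using System;
using System.Threading.Tasks;

class TaskStartVsRun
{
    static async Task Main()
    {
        // The pattern from the snippet above: a "cold" task started manually.
        var cold = new Task(() => Console.WriteLine("started via Start()"));
        cold.Start();               // queues it on the default ThreadPool scheduler
        await cold;

        // The idiomatic equivalent: create and schedule in one call.
        await Task.Run(() => Console.WriteLine("started via Task.Run"));
    }
}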

This method works, but it's very slow: it takes over 5 s to complete, while the single-threaded version below takes around 0.5 s to complete the whole program on the same data set:

public void Scan2(bool recursive = false) {
    foreach (var p in path.scan) Add(p);
    if (!recursive) return;
    foreach (var s in subs) s.Scan2(true);
}

I wonder what really goes wrong with the first method. The machine is not under load, CPU usage is insignificant, the drive is fine... I've tried profiling it with NProfiler, but it doesn't tell me much besides the fact that the program sits on Task.WaitAll(tks) all the time.

I also wrote a thread-locked counting mechanism that is invoked during the addition of each item. Maybe the problem is there?

#region SubCounting
public Dictionary<Type, int> counters = new Dictionary<Type, int>();
private object cLock = new object();
private int _sc = 0;
public int subCount => _sc;
private void inCounter(Type t) {
    lock (cLock) {
        // start at 0 so the increment below counts the first occurrence exactly once
        if (!counters.ContainsKey(t)) counters.Add(t, 0);
        counters[t]++;
        _sc++;
    }
    if (parent != null) parent.inCounter(t); // propagate the count up the tree
}
#endregion

But even if threads are waiting here, wouldn't the execution time be similar to the single-threaded version, as opposed to 10x slower?
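
For what it's worth, if lock contention were the suspect, the counter could be rewritten without a lock using `Interlocked` and `ConcurrentDictionary`. A sketch only, assuming `parent` is a node of the same type (this is not the project's actual code):

using System;
using System.Collections.Concurrent;
using System.Threading;

class Node
{
    public Node parent;

    // AddOrUpdate is atomic per key, so no explicit lock is needed.
    public readonly ConcurrentDictionary<Type, int> counters =
        new ConcurrentDictionary<Type, int>();

    private int _sc;
    public int subCount => Volatile.Read(ref _sc);

    public void inCounter(Type t)
    {
        counters.AddOrUpdate(t, 1, (_, n) => n + 1);
        Interlocked.Increment(ref _sc);
        parent?.inCounter(t);   // still propagates up the tree, but lock-free
    }
}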

I'm not sure how to approach this. If I don't want to use Tasks, do I need to manage threads manually, or is there already a library that would fit the job nicely?

  • I suspect that launching this many tasks ([you should be using `Task.Run`](https://stackoverflow.com/a/29693430/14357), not creating task instances and `.Start`ing them) is creating a backlog of ThreadPool jobs, causing ThreadPool starvation (the ThreadPool has a finite number of threads and a queue... it only fires up extra threads when it becomes clear that the queue isn't shrinking... this measurement takes time, causing latency in your app; a small repro of this effect is sketched after these comments). There's really no reason to manually spawn additional threads for IO... your app will be able to keep up. Consider the dedicated `async`... – spender Nov 25 '18 at 21:28
  • 2
    ... versions of the file IO commands you are using. And stop abusing threads. – spender Nov 25 '18 at 21:28
  • BTW I voted to close because although this question is about file IO, there's no code pertaining to file IO anywhere in your question. – spender Nov 25 '18 at 21:31
  • 1
    _abusing threads_ very much to the point. threads are for spreading workload and to do so has its costs. No workload here you can spread, so all you get it cost. – TaW Nov 25 '18 at 21:32
  • This question IS NOT about file `IO` but about recursive algorithms using multiple threads. I thought I stated that clearly. Right now I'm using the algorithm on the file system, but this is an abstraction layer that is independent of `IO`, and I can switch the underlying system to something else. – Paweł Audionysos Nov 25 '18 at 22:08
  • And the whole point of my question is HOW not to abuse threads in such a case, and you tell me not to abuse threads... – Paweł Audionysos Nov 25 '18 at 22:11
  • Also, the `Add(p)` method can possibly invoke some additional code later that could be expensive, so the idea was to spawn a new thread only when it's needed. I'm not sure how Tasks are implemented under the hood, so I just tested. – Paweł Audionysos Nov 25 '18 at 22:16
  • If the question is purely about multithreading and not I/O, it would be nice to post a full version with a simulated workload that we can run locally to check what the bottleneck is – Kevin Gosse Nov 25 '18 at 22:17
  • Well, I could post it, but I'm not sure how. The whole thing is about 1K lines of undocumented code, not finished and not fully tested, so I don't know... Should I post it on GitHub or something? – Paweł Audionysos Nov 25 '18 at 22:22
  • Can't you write a minimal example? If you need 1k lines to reproduce the performance issue, there's no way we're going to diagnose it with your code snippets – Kevin Gosse Nov 25 '18 at 22:25
  • I don't know... I thought the ones I posted in the question are minimal examples. There is not much going on besides the code I posted - it's just a data tree. `Add(p)` just adds a child node and calls `inCounter()` to notify the parent that it has grown. `path.scan` just enumerates the paths in the data system. Other than that, it's just a bunch of overloads and utilities in the library that are not invoked. – Paweł Audionysos Nov 25 '18 at 22:41
  • I could extract only the methods used, for simplicity; it would probably be around 50 lines, but that would require a great amount of work right now... I mean, I'm going to sleep right now. – Paweł Audionysos Nov 25 '18 at 22:47
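
To illustrate the ThreadPool starvation spender describes, here is a minimal, self-contained repro (an editorial sketch, not code from the question). Each work item blocks its pool thread, and the pool only injects threads beyond the minimum slowly (roughly one every half second or so), so the jobs start in delayed bursts:

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class StarvationDemo
{
    static void Main()
    {
        ThreadPool.SetMinThreads(2, 2);   // small pool makes the effect visible quickly

        var sw = Stopwatch.StartNew();
        var tasks = new Task[16];
        for (int i = 0; i < tasks.Length; i++)
        {
            int n = i;
            // Each work item blocks its ThreadPool thread, just like the
            // nested Task.WaitAll calls in the question.
            tasks[n] = Task.Run(() =>
            {
                Console.WriteLine($"job {n} started at {sw.Elapsed}");
                Thread.Sleep(1000);
            });
        }
        Task.WaitAll(tasks);   // blocking here is fine: Main is not a pool thread
        Console.WriteLine($"total: {sw.Elapsed}");
    }
}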

1 Answer


I think you almost got it: `Task.WaitAll(tks)` is the problem. It blocks a thread, because it is a synchronous operation, so you soon run out of ThreadPool threads; all of them are just sitting there waiting for tasks that have no thread left to run on. You can solve this with async: replace the blocking wait with `await Task.WhenAll(...)`, which frees the thread while waiting. With some CPU workload added, the multithreaded version is significantly faster; when purely IO bound, the two are roughly equal.

ConcurrentBag<string> result = new ConcurrentBag<string>();   // filled concurrently, so it must be thread-safe
List<string> result2 = new List<string>();

public async Task Scan(string path)
{
    await Task.Run(async () =>
    {
        var subs = Directory.GetDirectories(path);
        // await instead of WaitAll: the ThreadPool thread is released while waiting
        await Task.WhenAll(subs.Select(s => Scan(s)));

        // simulated CPU workload per directory
        result.Add(Enumerable.Range(0, 1000000).Sum(i => path[i % path.Length]).ToString());
    });
}

public void Scan2(string path)
{
    result2.Add(Enumerable.Range(0, 1000000).Sum(i => path[i % path.Length]).ToString());

    var subs = Directory.GetDirectories(path);
    foreach (var s in subs) Scan2(s);
}

private async void button4_Click(object sender, EventArgs e)
{
    string dir = @"d:\tmp";

    System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
    st.Start();
    await Scan(dir);
    st.Stop();
    MessageBox.Show(st.ElapsedMilliseconds.ToString());

    st = new System.Diagnostics.Stopwatch();
    st.Start();
    Scan2(dir);
    st.Stop();
    MessageBox.Show(st.ElapsedMilliseconds.ToString());

    MessageBox.Show(result.OrderBy(x => x).SequenceEqual(result2.OrderBy(x => x)) ? "OK" : "ERROR");
}
Antonín Lejsek
  • I think there is nothing more to say, you just restored my faith in SO. You are a hero! :) Well, the Select method didn't work for me, but I changed it to the previous `Task[]` approach and it works like a charm. I'm going to accept your answer tomorrow, because I want to add some extra bounty, if you like :) – Paweł Audionysos Nov 26 '18 at 17:18