I am scanning folders on a network drive iteratively (basically a do-while loop). To speed up the process, I want to use a thread or task per folder and collect the results in a List whose items have the attributes path, filename, lastwriteutc, and size. This is what I have so far:

string startFolder = @"D:\Development\TempScan";
string startFolder2 = @"D:\RemotePc";
List<string> foldersToScan = new List<string>();
foldersToScan.Add(startFolder);
foldersToScan.Add(startFolder2);

foreach (var folder in foldersToScan)
{
    Thread thread = new Thread(() => IterateFolder(findInfoLevel, additionalFlags, folder))
    {
        Name = "Thread " + folder
    };

    thread.Start();
    Console.WriteLine(thread.Name.ToLower() + " has started");
    //thread.Join();
}
// Database operation here 

If I call thread.Join() inside the loop, the code becomes synchronous, and only then can I perform the database operation. If I don't call thread.Join(), the database operation runs before the scan has finished.

I have also tried a Task, as shown below, but it is also synchronous:

Task task1 = Task.Run(() =>
{
    foreach (var folder in foldersToScan)
    {
        IterateFolder(findInfoLevel, additionalFlags, folder);
    }
});
task1.Wait();

How can I scan the folders asynchronously, with one thread per folder, and then insert the results into the database asynchronously once they are available?

stuartd
MiralShah
  • @stuartd can you point me to a reference or an example? – MiralShah Sep 06 '21 at 13:02
  • Look at `Parallel.For` or `Task.WhenAll` instead of manually creating threads. But start by looking at [faster file enumeration](https://stackoverflow.com/questions/17756042/improve-the-performance-for-enumerating-files-and-folders-using-net). – JonasH Sep 06 '21 at 13:02
  • `Task.Run` isn't synchronous. The code executes in the background. `Wait()` is a blocking call though. If you want to scan multiple paths concurrently, you need to start multiple tasks and await all of them with `await Task.WhenAll()`. – Panagiotis Kanavos Sep 06 '21 at 13:03
  • @JonasH I have used the Win32 API to iterate the folder, as it is 20 times faster than the regular built-in functions available in .NET. – MiralShah Sep 06 '21 at 13:05
  • Don't expect huge improvements from running scans in parallel. A network drive might be limited by network and disk I/O limits with regard to concurrent access. Do some rough tests before going all out on coding tasks. – Emond Sep 06 '21 at 13:05
  • @PanagiotisKanavos you got me right, thanks. Can you share an example? :) – MiralShah Sep 06 '21 at 13:06
  • @EmondErno you are right, but let me get the parallel or async version working before reaching a conclusion. – MiralShah Sep 06 '21 at 13:09
  • My point is that a simple, rough test might make it clear whether it is possible or worth it at all. I'd split it up into two parallel threads and measure to see if there are any improvements before diving in and writing a bunch of code that cannot succeed. Measure before optimizing. Note that I am not saying that it is impossible. – Emond Sep 06 '21 at 14:10
  • Do keep in mind that starting a thread allocates a 1MB stack for each one. If you have a million folders then you're going to kill your process. – Enigmativity Sep 07 '21 at 01:04
  • @Enigmativity thanks for the information! Is there an alternative solution? – MiralShah Sep 07 '21 at 04:41
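
One way to address the concern about one thread per folder is to cap the number of folders scanned at once. The following is only a sketch: `BoundedScan`, `Run`, the `scanFolder` delegate (a stand-in for the question's `IterateFolder`), and `maxParallel` are hypothetical names, and the right limit for a network drive would have to be measured.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedScan
{
    // Scans every folder, but never more than maxParallel folders at the same time.
    // scanFolder is an assumed stand-in for the question's IterateFolder that
    // returns the items found in one folder.
    public static List<TResult> Run<TResult>(
        IEnumerable<string> folders,
        Func<string, IEnumerable<TResult>> scanFolder,
        int maxParallel)
    {
        // ConcurrentBag is safe to add to from several threads at once.
        var results = new ConcurrentBag<TResult>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Parallel.ForEach uses thread-pool threads and returns when all folders are done.
        Parallel.ForEach(folders, options, folder =>
        {
            foreach (var item in scanFolder(folder))
            {
                results.Add(item);
            }
        });

        return new List<TResult>(results);
    }
}
```

Because the work runs on pooled threads, a million folders would queue up behind `maxParallel` workers instead of each getting its own thread and stack. The order of the returned items is not guaranteed.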

2 Answers

var threads = new List<Thread>();

foreach (var folder in foldersToScan)
{
    var thread = new Thread(() => IterateFolder(findInfoLevel, additionalFlags, folder))
    {
        Name = "Thread " + folder
    };

    thread.Start();
    Console.WriteLine(thread.Name.ToLower() + " has started");
    threads.Add(thread);
}

foreach(var thread in threads)
{
    thread.Join();
}

This way the code waits/blocks until all threads are done. You could/should make this more robust: error handling, time-outs or switching to Tasks.
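
Switching to Tasks could look roughly like the sketch below, which starts one task per folder and awaits them together with `Task.WhenAll`. `FileEntry`, `FolderScanner`, `ScanFolder`, and `ScanAllAsync` are hypothetical names, and `ScanFolder` uses plain .NET enumeration as a stand-in for the Win32-based `IterateFolder` from the question. It assumes C# 9 or later for the record type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical result type with the attributes named in the question.
public sealed record FileEntry(string Path, string FileName, DateTime LastWriteUtc, long Size);

public static class FolderScanner
{
    // Assumed stand-in for IterateFolder: lists all files below one folder.
    public static List<FileEntry> ScanFolder(string folder) =>
        new DirectoryInfo(folder)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Select(f => new FileEntry(f.DirectoryName ?? "", f.Name, f.LastWriteTimeUtc, f.Length))
            .ToList();

    // Starts one task per folder and completes once every scan has finished.
    public static async Task<List<FileEntry>> ScanAllAsync(IEnumerable<string> folders)
    {
        // ToList() forces all tasks to start now instead of lazily.
        List<Task<List<FileEntry>>> tasks = folders
            .Select(folder => Task.Run(() => ScanFolder(folder)))
            .ToList();

        // Awaiting does not block the calling thread, unlike Wait() or Join().
        List<FileEntry>[] perFolder = await Task.WhenAll(tasks);

        // Flatten the per-folder lists into one result list.
        return perFolder.SelectMany(entries => entries).ToList();
    }
}
```

From an async method this would be called as `var entries = await FolderScanner.ScanAllAsync(foldersToScan);`, with the database operation placed on the next line so that it only runs after all scans have completed. If any scan throws, the exception surfaces at the `await`.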

Emond
  • It is working asynchronously now :) What a trick :) Many thanks. Can you elaborate on the things you mentioned for making it more robust, and point to any learning examples available online? :) – MiralShah Sep 06 '21 at 14:31
  • var threads = new Thread(() => IterateFolder(findInfoLevel, additionalFlags, folder)) { Name = "Thread " + folder }; in above code just typo mistake remove "s" in threads – MiralShah Sep 06 '21 at 14:33
  • Fixed. Thanks! As for adding improvements: thread.Join can take a time-out as a parameter and will return a boolean that indicates if the thread returned within the time out. So that would allow you to check for that and add an if in the foreach loop. Adding an exception handler (try...catch) can be done in the loop as well. I am not adding these because how you code them depends largely on what the program needs to happen in those cases (retry, resume, fail, abort) and that would distract from the answer to the original question. – Emond Sep 06 '21 at 14:53
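
The time-out and exception handling described in the last comment could be sketched as follows. `ThreadedScan`, `RunAll`, the `scanFolder` delegate (standing in for `IterateFolder`), and the choice to simply record failures rather than retry are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class ThreadedScan
{
    // Runs scanFolder on one thread per folder and returns, per failed folder,
    // why it failed (exception message or "timed out").
    public static Dictionary<string, string> RunAll(
        IEnumerable<string> folders,
        Action<string> scanFolder,
        TimeSpan timeoutPerThread)
    {
        var failures = new ConcurrentDictionary<string, string>();
        var threads = new List<(string Folder, Thread Thread)>();

        foreach (var folder in folders)
        {
            var thread = new Thread(() =>
            {
                try
                {
                    scanFolder(folder);
                }
                catch (Exception ex)
                {
                    // An unhandled exception on a thread would terminate the process,
                    // so it is caught and recorded here instead.
                    failures[folder] = ex.Message;
                }
            })
            {
                Name = "Thread " + folder,
                // Assumption: a scan that timed out should not keep the process alive.
                IsBackground = true
            };

            thread.Start();
            threads.Add((folder, thread));
        }

        foreach (var (folder, thread) in threads)
        {
            // Join(TimeSpan) returns false if the thread is still running afterwards.
            if (!thread.Join(timeoutPerThread))
            {
                failures.TryAdd(folder, "timed out");
            }
        }

        return new Dictionary<string, string>(failures);
    }
}
```

The time-out applies to each `Join` call in turn, so the total wait can be longer than a single time-out. A thread that timed out keeps running in the background; what should happen to it (retry, abort, ignore) depends on the program.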

You can also use PLINQ:

foldersToScan.AsParallel().ForAll( f => IterateFolder( findInfoLevel, additionalFlags, f ) );

ForAll() runs the action for every folder and blocks until all of them are done. Here are some more examples of how to use AsParallel() from LINQ.

To make it asynchronous, you can run this in a single BackgroundWorker or Thread.
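
If the scan should return its results instead of only running, a PLINQ query can collect them into a list. This is a sketch under assumptions: `PlinqScan`, `ScanAll`, and `maxParallel` are hypothetical names, and plain .NET file enumeration stands in for `IterateFolder`. PLINQ queries are lazy, so a terminal call such as `ToList()` is what actually runs the scan.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PlinqScan
{
    // Scans the folders in parallel and returns one tuple per file found.
    public static List<(string Path, string FileName, DateTime LastWriteUtc, long Size)> ScanAll(
        IEnumerable<string> folders,
        int maxParallel = 4)
    {
        return folders
            .AsParallel()
            // Upper bound on the number of folders processed concurrently.
            .WithDegreeOfParallelism(maxParallel)
            // Assumed stand-in for IterateFolder: enumerate all files below the folder.
            .SelectMany(folder => new DirectoryInfo(folder)
                .EnumerateFiles("*", SearchOption.AllDirectories))
            .Select(f => (
                Path: f.DirectoryName ?? "",
                FileName: f.Name,
                LastWriteUtc: f.LastWriteTimeUtc,
                Size: f.Length))
            // ToList() forces execution and blocks until all folders are scanned.
            .ToList();
    }
}
```

To keep the caller responsive, the whole call can be wrapped as `await Task.Run(() => PlinqScan.ScanAll(foldersToScan))`. The order of the returned items is not guaranteed.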

izlin