
I have a WinForms application that reads several network folders and searches for files inside them. The following function receives the folders as a List<string>:

private decimal _numberOfFiles;
private static List<string> _folders;
public delegate void OnFileAddDelegate(List<string> files);
public event OnFileAddDelegate OnFileAddEventHandler;
public delegate void OnFinishSearchDelegate();
public event OnFinishSearchDelegate OnFinishSearchEventHandler;

public void SearchFiles()
{
    foreach (string folder in _folders)
    {
        if (Directory.Exists(folder))
        {
            var files = Directory.EnumerateFiles(folder, "*.doc", SearchOption.TopDirectoryOnly)
                .OrderByDescending(x => new FileInfo(x).CreationTime)
                .Take((int)_numberOfFiles)
                .ToList();
            if (OnFileAddEventHandler != null)
                OnFileAddEventHandler(files);
        }
    }

    if (OnFinishSearchEventHandler != null)
        OnFinishSearchEventHandler();
}

After the OnFileAddEventHandler(files) event fires, my ProducerConsumer class starts checking the list of files that were found and does the work (if a file is OK, it fires an event to my main UI to add the file to my ListView):

public class ProducerConsumer
{
    public delegate void OnFileAddDelegate(PcapFileDetails pcapFileDetails);
    public event OnFileAddDelegate OnFileAddEventHandler;
    public delegate void AllFilesProcessedDelegate();
    public event AllFilesProcessedDelegate AllFilesProcessedEventHandler;
    private readonly Queue<string> _queue;
    private int counter;

    public ProducerConsumer(int workerCount, IEnumerable<string> list)
    {
        _queue = new Queue<string>(list); // fill the queue
        counter = _queue.Count; // set up counter
        for (int i = 0; i < workerCount; i++)
            Task.Factory.StartNew(Consumer);
    }

    private void Consumer()
    {
        FileChecker fileChecker = new FileChecker();
        for (; ; )
        {
            string file;
            lock (_queue)
            {
                // synchronize on the queue
                if (_queue.Count == 0) return;  // we are done
                file = _queue.Dequeue(); // get file name to process
            } // release the lock to allow other consumers to access the queue
            // do the job
            // assuming FileChecker.Check returns a PcapFileDetails, or null when the file is not OK
            PcapFileDetails result = fileChecker.Check(file);

            if (OnFileAddEventHandler != null && result != null) // file is OK: notify the main UI
                OnFileAddEventHandler(result);

            // decrement the counter
            if (Interlocked.Decrement(ref counter) != 0)
                continue; // not the last

            // all done - we were the last
            if (AllFilesProcessedEventHandler != null)
                AllFilesProcessedEventHandler();
            return;
        }
    }
}

Now, while this search is in progress, my UI is locked to prevent unnecessary clicks, and I want to unlock it once all of my folders have finished being searched. My problem is that because I search several folders, the AllFilesProcessedEventHandler() event fires several times, and I want to know when all of my searches have finished.
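
A minimal sketch of one way to get a single combined notification, using a shared CountdownEvent (the SearchCoordinator class and its member names are hypothetical, not part of the code above): register one count per folder before its ProducerConsumer starts, and release the initial count when folder enumeration finishes, so the final event fires exactly once.

using System.Threading;

public class SearchCoordinator
{
    public delegate void AllSearchesFinishedDelegate();
    public event AllSearchesFinishedDelegate AllSearchesFinishedEventHandler;

    // start at 1 so the count cannot reach zero before every batch is registered
    private readonly CountdownEvent _pending = new CountdownEvent(1);

    // call once per folder, before starting that folder's ProducerConsumer
    public void RegisterBatch()
    {
        _pending.AddCount();
    }

    // wire this to each ProducerConsumer.AllFilesProcessedEventHandler
    public void OnBatchFinished()
    {
        Signal();
    }

    // call from OnFinishSearchEventHandler; releases the initial count of 1
    public void SearchEnumerationComplete()
    {
        Signal();
    }

    private void Signal()
    {
        // CountdownEvent.Signal() returns true only when the count reaches zero,
        // so the event below fires exactly once
        if (_pending.Signal() && AllSearchesFinishedEventHandler != null)
            AllSearchesFinishedEventHandler();
    }
}

SearchFiles would call RegisterBatch() before constructing each ProducerConsumer, SearchEnumerationComplete() when its foreach over the folders ends, and each ProducerConsumer would call OnBatchFinished() in place of firing its own event to the UI.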

1 Answer

Here is a recursive sample with QuickIO.Net

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using SchwabenCode.QuickIO;

namespace ConsoleApplication3
{
    internal class Program
    {
        private static readonly BlockingCollection<QuickIOFileInfo> fileInfos = new BlockingCollection<QuickIOFileInfo>();
        private static void Main(string[] args)
        {
            var task = Task.Factory.StartNew(() =>
            {
                Int64 totalSize = 0; // Int64: summing file sizes across a disk would overflow Int32
                Parallel.ForEach(fileInfos.GetConsumingEnumerable(), fi =>
                {
                    Interlocked.Add(ref totalSize, (long)fi.Bytes);
                });
                Console.WriteLine("All docs bytes amount to {0}", totalSize);
            });

            ProcessDirectory("C:\\");
            fileInfos.CompleteAdding();

            Task.WaitAll(task);
        }

        private static void ProcessDirectory(string path)
        {
            Parallel.ForEach(QuickIODirectory.EnumerateDirectories(path), dir =>
            {
                try
                {
                    Parallel.ForEach(QuickIODirectory.EnumerateFiles(dir), file =>
                    {
                        if (file.AsFileInfo().Extension.Equals(".docx", StringComparison.OrdinalIgnoreCase))
                            fileInfos.Add(file);
                    });
                    ProcessDirectory(dir.FullName);
                }
                catch (Exception)
                {
                    Console.WriteLine("Unable to access directory {0}", dir.FullName);
                }
            });
        }
    }
}

The BlockingCollection automatically signals the consuming Parallel.ForEach that no more elements will arrive once CompleteAdding() is called, so the consumer can finish instead of blocking forever.
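
If you don't want the QuickIO dependency, the same completion-signaling pattern works with the plain System.IO types. A minimal sketch, assuming a placeholder C:\docs root and the *.doc filter from the question:

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

internal class Program
{
    private static readonly BlockingCollection<FileInfo> files =
        new BlockingCollection<FileInfo>();

    private static void Main(string[] args)
    {
        // consumer: GetConsumingEnumerable() blocks until items arrive and
        // ends only after CompleteAdding() is called and the queue drains
        var consumer = Task.Factory.StartNew(() =>
        {
            long totalSize = 0;
            foreach (var fi in files.GetConsumingEnumerable())
                totalSize += fi.Length;
            Console.WriteLine("All docs bytes amount to {0}", totalSize);
        });

        // producer: enumerate lazily, then signal completion exactly once
        foreach (var path in Directory.EnumerateFiles(
                     @"C:\docs", "*.doc", SearchOption.AllDirectories))
        {
            files.Add(new FileInfo(path));
        }
        files.CompleteAdding();

        consumer.Wait();
    }
}

CompleteAdding() is what lets the consumer's loop terminate; without it, GetConsumingEnumerable() would wait for more items forever.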

Scanning a 256GB SSD with 74GB free and a total of 738k+ files took 16.8s.

  • Why use `Interlocked.Add`? A simple Map-Reduce seems more efficient, and non-blocking. The only reason for an `Interlocked.Add` is if you want to check the running tally. Even then it is possible to batch your Map-Reduce (slightly harder, but doable). – Aron Sep 10 '14 at 17:18
  • True, but the sample by no means is production quality code. It was just to demonstrate how to proceed with his challenge without reinventing the blocking collection from scratch. – Darek Sep 10 '14 at 17:33
  • `.AsParallel().Sum(fi => (int)fi.Bytes)` – Aron Sep 10 '14 at 17:36
  • @Aron thanks for your comments. I suspect user3637066 will want to do more than just sum the size of files. – Darek Sep 10 '14 at 17:48
  • What is QuickIOFileInfo? – user3637066 Sep 10 '14 at 17:53
  • A superfast IO library. http://quickio.net/ Just add it to your project with `Install-Package QuickIO.Net`. 44,000 files per second on my SSD. – Darek Sep 10 '14 at 17:55
  • @Aron BTW, using Map-Reduce might have shortened the time by 20%. But then again, the user is not just summing sizes. Nice suggestion nevertheless. – Darek Sep 10 '14 at 18:28
  • I don't need the recursion, so I moved the first Parallel.ForEach and it crashed. – user3637066 Sep 10 '14 at 21:03
  • Since I don't know how you "moved" it, I can't really tell why it crashed. It sounds to me like you need to play with C# a little and gain more experience. Keep up the good work! – Darek Sep 10 '14 at 21:11