  • I have a base folder, Data, on a drive, and under it I have around 100 folders.

  • Into each folder (Folder1 to Folder100), a third-party application pushes zip files (each zip contains one or more files).

  • I have to write a Windows service that watches all 100 folders for file arrival.

  • Once a file is available, I need to extract the zip file and place all the extracted files into a second folder. I need to do this for each folder (Folder1 to Folder100) as soon as files are available.

  • The code below suggests that with the C# FileSystemWatcher I can watch one folder at a time and act on that.

The question is: how do I watch 100 folders in parallel?

 class ExampleAttributesChangedFiringTwice
{
    public ExampleAttributesChangedFiringTwice(string demoFolderPath)
    {
        var watcher = new FileSystemWatcher()
        {
            Path = demoFolderPath,
            NotifyFilter = NotifyFilters.LastWrite,
            Filter = "*.txt"
        };

        watcher.Changed += OnChanged;
        watcher.EnableRaisingEvents = true;
    }

    private static void OnChanged(object source, FileSystemEventArgs e)
    {
        // extract zip file, do the validation, copy file into other destination
    }
}

The target folder, is it the same folder regardless of the source folder of the zip? That is, it doesn't matter if it's from Folder1 or Folder2, both will be extracted to FolderX?

The target folder is common for all: "C:\ExtractedData".

So every folder under Data will be watched? No "blacklisted" folder? What about if a zip appears in Data itself instead of its subfolder? What if a new subfolder is created, should it be watched too?

A zip always arrives inside a subfolder; it will never be created directly in the Data folder. Yes, there is a chance that more subfolders will be added in the future, and they will need to be watched too.

And do the extracted files go into a separate subfolder inside the target folder based on their zip filename, or do they just get extracted into the target folder? E.g., if it's A.zip, does the content go to Target\A or just Target?

For example, if A.zip contains 2 files, "1.txt" and "2.txt", then both files go to "C:\ExtractedData". This is the same for every zip file that arrives in any subfolder.

user584018
  • 1
    Setup 100 `FileSystemWatcher` instances. – mjwills Aug 31 '20 at 04:04
  • 3
    If those hundreds of folders are under a single folder (or even drive), [IncludeSubdirectories](https://learn.microsoft.com/en-us/dotnet/api/system.io.filesystemwatcher.includesubdirectories) will enable you to have only one watcher active, then filter events by their path – Martheen Aug 31 '20 at 04:05
  • 1
    Also, how often are these files updated, added, or changed, etc.? You might need to increase the watcher's buffer if these directories are being thrashed. – TheGeneral Aug 31 '20 at 04:10
  • @Michael Randall, files are not modified, only added to the directory. – user584018 Aug 31 '20 at 04:15
  • @Martheen, yes, all 100 folders are in a single folder. Could you please give some more guide on " filter events by their path"? – user584018 Aug 31 '20 at 04:16
  • Are there lots of files added per minute? – TheGeneral Aug 31 '20 at 04:17
  • Anyway the documentation here https://learn.microsoft.com/en-us/dotnet/api/system.io.filesystemwatcher?view=netcore-3.1, it has examples and all the information you need to use it – TheGeneral Aug 31 '20 at 04:17
  • 1
    HashSet's [Contains](https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.hashset-1.contains) is O(1) operation, just put the path there, if you have different action depending on the path, setup a Dictionary with the key being the path and the value being the delegate. – Martheen Aug 31 '20 at 04:30
  • Thanks @ Martheen, if you can put some code. – user584018 Aug 31 '20 at 05:23
  • Update your question with more details, such as whether the 100 folder list is available in a line/comma/tab separated list txt file, DB, API call etc, if under the watched folder there's only few or none subfolder that don't need to be watched, if you have different action depending on the path, etc, since the answer will depend on that – Martheen Aug 31 '20 at 05:40
  • @ Martheen, Thanks! My use case is very simple. I have base folder and under that I have 100 subfolders where 3rd party app is pushing files (24*7). I need to watch each folder, look for zip file, extract it , place in different folder and delete the zip file. Same activity for all folders. I edited my question. Do let me know for any further. Thanks! – user584018 Aug 31 '20 at 06:46
  • 1
    The target folder, is it the same folder regardless of the source folder of the zip? That is, it doesn't matter if it's from Folder1 or Folder2, both will be extracted to FolderX? – Martheen Aug 31 '20 at 07:48
  • 1
    So *every* folder under Data will be watched? No "blacklisted" folder? What about if a zip appears in Data itself instead of its subfolder? What if a new subfolder is created, should it be watched too? – Martheen Aug 31 '20 at 07:49
  • 1
    And does the extracted files goes into a separate subfolder inside the target folder based on their zip filename, or do they just get extracted on the target folder, eg, if it's A.zip, does the content goes to Target\A or just Target – Martheen Aug 31 '20 at 07:56
  • @Martheen, Thanks for all points. I edit my question, where I gave clarification for my need. Thanks for your time. Appreciate! – user584018 Aug 31 '20 at 08:19

1 Answer

The "100 folders in parallel" part turns out to be a red herring. Since all new zip files are treated the same regardless of where they show up, a single watcher on the base folder with IncludeSubdirectories = true is enough; it also picks up subfolders created later. Note that the following code is prone to exceptions as written, so read the comments.

using System.IO;
using System.IO.Compression; // ZipFile (on .NET Framework, reference System.IO.Compression.FileSystem)

class WatchAndExtract
{
    private readonly string inputPath, targetPath;
    // kept in a field so the watcher lives as long as this object does
    private readonly FileSystemWatcher watcher;

    public WatchAndExtract(string inputPath, string targetPath)
    {
        this.inputPath = inputPath;
        this.targetPath = targetPath;
        watcher = new FileSystemWatcher()
        {
            Path = inputPath,
            NotifyFilter = NotifyFilters.FileName,
            // add other filters if your 3rd party app doesn't copy a new file in one go, but instead creates it and then writes to it
            Filter = "*.zip",
            IncludeSubdirectories = true
        };
        watcher.Created += OnCreated; // use Changed if the file isn't copied in one go
        watcher.EnableRaisingEvents = true;
    }

    private void OnCreated(object source, FileSystemEventArgs e)
    {
        // add filters if you're using Changed instead:
        // https://stackoverflow.com/questions/1764809/filesystemwatcher-changed-event-is-raised-twice
        // dispose the archive so the handle on the zip file is released after extraction
        using (var archive = ZipFile.OpenRead(e.FullPath))
        {
            archive.ExtractToDirectory(targetPath);
        }
        // This throws if the zip file is still being written, or if a file with the same name already exists in targetPath.
        // Catch the exception and add a delay before retrying, or wait until the last LastWrite event is a few seconds old.
    }
}
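
The "catch and add delay before retry" idea from the comments in the code above could look roughly like the following. This is only a sketch under assumptions: the helper name `TryExtractWithRetry`, the attempt count, and the delay are all made up for illustration, and it relies on the third-party app holding the zip open while writing it, so that opening the file with `FileShare.None` fails until the writer is done.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading;

static class ZipRetry
{
    // Hypothetical helper: returns true once the zip was extracted,
    // false if every attempt failed.
    public static bool TryExtractWithRetry(string zipPath, string targetPath,
                                           int maxAttempts = 5, int delayMs = 1000)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                // FileShare.None makes the open fail while another process
                // still has the file open (assumed: the writer keeps it open).
                using (var stream = new FileStream(zipPath, FileMode.Open,
                                                   FileAccess.Read, FileShare.None))
                using (var archive = new ZipArchive(stream, ZipArchiveMode.Read))
                {
                    archive.ExtractToDirectory(targetPath);
                }
                return true;
            }
            catch (IOException)
            {
                // File is locked (or a target file already exists); try again later.
            }
            catch (InvalidDataException)
            {
                // Zip is incomplete or corrupt; it may still be arriving.
            }

            if (attempt < maxAttempts)
            {
                Thread.Sleep(delayMs);
            }
        }
        return false;
    }
}
```

Calling `ZipRetry.TryExtractWithRetry(e.FullPath, targetPath)` from the event handler would block that handler while it sleeps, so with many incoming files it is probably better to run it from a worker thread. Note that a name clash in the target folder also surfaces as an `IOException` here, so a `false` result does not necessarily mean the zip was locked.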

If some files are skipped, either too many files are being created at once or the zips are too big to process inside the event handler (or both). Either increase the watcher's buffer size (InternalBufferSize) or do the extraction on a separate thread so that the handler returns quickly. On an HDD with busy I/O, or with extremely large zip files, extraction may not keep up with the rate of incoming events, and files can be skipped after a prolonged busy period; in that case, consider extracting to a different physical drive (not just a different partition on the same device). Always verify with your predicted usage pattern.
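
As a rough illustration of both suggestions, the sketch below raises the buffer to 64 KB (the documented maximum for `InternalBufferSize`) and has the event handler only enqueue the path, while a single background task does the slow work. The class name `QueuedZipWatcher` and the `processZip` callback are assumptions introduced for this example, not part of the code above.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

sealed class QueuedZipWatcher : IDisposable
{
    private readonly FileSystemWatcher watcher;
    private readonly BlockingCollection<string> pending = new BlockingCollection<string>();
    private readonly Task worker;

    // processZip is a hypothetical callback that extracts one zip file.
    public QueuedZipWatcher(string inputPath, Action<string> processZip)
    {
        // One background worker drains the queue, so extraction never
        // runs on the watcher's event thread.
        worker = Task.Run(() =>
        {
            foreach (var path in pending.GetConsumingEnumerable())
            {
                try { processZip(path); }
                catch (Exception ex)
                {
                    Console.Error.WriteLine($"Failed on {path}: {ex.Message}");
                }
            }
        });

        watcher = new FileSystemWatcher(inputPath, "*.zip")
        {
            NotifyFilter = NotifyFilters.FileName,
            IncludeSubdirectories = true,
            InternalBufferSize = 64 * 1024 // default is 8 KB
        };
        watcher.Created += (s, e) =>
        {
            // Keep the handler cheap: just record the path.
            try { pending.Add(e.FullPath); }
            catch (InvalidOperationException) { /* queue already closed during shutdown */ }
        };
        // Raised, for example, when the internal buffer overflows.
        watcher.Error += (s, e) => Console.Error.WriteLine(e.GetException());
        watcher.EnableRaisingEvents = true;
    }

    public void Dispose()
    {
        watcher.Dispose();
        pending.CompleteAdding(); // lets the worker loop finish
        worker.Wait();
        pending.Dispose();
    }
}
```

A single worker keeps extraction sequential, which tends to be gentler on a busy disk; whether several workers would help depends on the drive and the zip sizes, so it is worth measuring with a realistic load.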

Martheen