0

The class written below exposes two public methods: MapLocalFiles() and MapLocalFilesAsync(). They simply populate a dictionary with filenames as keys and list of filepath as values. The first works correctly but the latter fails throwing ThreadAbortException and I cannot identify the problem.

public class FileMapper
{
    private string rootFolder;

    public FileMapper(string rootFolder)
    {
        this.rootFolder = rootFolder;
    }

    private ConcurrentDictionary<string, List<string>> Files { get; } = new ConcurrentDictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
    
    public void MapLocalFiles()
    {
        TraverseFolder(rootFolder);
        Debug.WriteLine(nameof(MapLocalFiles) + " has terminated");
    }

    public async void MapLocalFilesAsync()
    {
        await Task.Run(() => TraverseFolder(rootFolder));
        Debug.WriteLine(nameof(MapLocalFilesAsync) + " has terminated");
    }

    private void TraverseFolder(string dirPath)
    {
        string[] filesPaths;

        filesPaths = Directory.GetFiles(dirPath);

        foreach (var filePath in filesPaths)
        {
            string filename = null;
            filename = Path.GetFileName(filePath);
            
            Debug.WriteLine(filePath);

            if (Files.ContainsKey(filename))
            {
                Files[filename].Add(filePath);
            }
            else
            {
                Files[filename] = new List<string> { filePath };
            }
        }

        var directories = Directory.GetDirectories(dirPath);

        foreach (var directory in directories)
        {
            TraverseFolder(directory);
        }
    }
}

The method using it is a test method simply as:

[TestMethod]
public void TestFolderMapper()
{
    FileMapper fileMapper = new FileMapper(@"D:\temp");
    //fileMapper.MapLocalFiles(); //Ok Works
    fileMapper.MapLocalFilesAsync(); //Quickly fails with ThreadAbortException
}

Thank you.

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
Paolo
  • 173
  • 2
  • 9
  • 7
    Why aren't you awaiting the outer call? `await fileMapper.MapLocalFilesAsync(); //Fails` – Rand Random Jul 19 '23 at 10:28
  • was just about to say that – Dean Van Greunen Jul 19 '23 at 10:28
  • 2
    It's `async void`, which is likely the cause of the problem. – Stephen Cleary Jul 19 '23 at 10:59
  • 1
    Your test harness is killing the thread after you call a fire-and-forget `fileMapper.MapLocalFilesAsync()`. – Enigmativity Jul 19 '23 at 13:04
  • 2
    Be aware that the `List` is not thread-safe, so the `ConcurrentDictionary>` as a whole is not thread-safe either. Your test doesn't make concurrent use of the `Files`, so the thread-safety problem is not detected. – Theodor Zoulias Jul 19 '23 at 13:22
  • 1
    Also you might find this interesting: [Parallel tree traversal in C#](https://stackoverflow.com/questions/7099703/parallel-tree-traversal-in-c-sharp). – Theodor Zoulias Jul 19 '23 at 13:24
  • 2
    I would recommend making your `TraverseFolder` create and return a dictionary instead of modifying a shared dictionary. Keeping resources as local as possible greatly simplifies thread safety. This would eliminate the thread safety problems Theodor mentions. – JonasH Jul 19 '23 at 14:03
  • Why not tell GetFiles or better yet, EnumerateFiles to search the folders recursively and then use `Parallel.ForEachAsync` to process the files concurrently? All tasks are reading from the same disk and while concurrency could offer a small performance increase it won't matter a lot. – Panagiotis Kanavos Jul 20 '23 at 13:41
  • 1
    What *could* help is to use `DirectoryInfo.EnumerateDirectories("*",SearchOption.EnumerateDirectories)` to get all directories and then call `DirectoryInfo.EnumerateFiles` on each, in parallel. In NTFS, each directory is essentially a file with metadata containing file entries – Panagiotis Kanavos Jul 20 '23 at 13:44

2 Answers2

2

While your own answer may work for you, a better way to do this is to async your test. First keep MapLocalFilesAsync that returns Task

public Task MapLocalFilesAsync()
{
    return Task.Run(() => TraverseFolder(rootFolder));
}

and then add async your test. Most, if not all, testing frameworks support this.

[TestMethod]
public async Task TestFolderMapper()
{
    FileMapper fileMapper = new FileMapper(@"D:\temp");
    await fileMapper.MapLocalFilesAsync();
}

If you later change your TraverseFolder method to asynchronously iterate, you can call it directly:

public async Task TraverseFolder()
{
    // use rootFolder ...
    await ...
    ...
}

...
 
[TestMethod]
public async Task TestFolderMapper()
{
    FileMapper fileMapper = new FileMapper(@"D:\temp");
    await fileMapper.TraverseFolder();
}

See Stephen Cleary's Async All the Way.

Kit
  • 20,354
  • 4
  • 60
  • 103
0

Thank you. "await" keyword worked. The calling method was a TestMethod in a TestClass. After calling MapLocalFilesAsync(), which initiated a new worker thread, It exited before the worker thread completed causing, I guess, thread abortion.

To be able to await the working thread from the calling method, I modified the MapLocalFilesAsync() as:

public Task MapLocalFilesAsync()
{
    return Task.Run(() => TraverseFolder(rootFolder));
}

and the calling function as:

[TestMethod]
public void TestFolderMapper()
{
    FileMapper fileMapper = new FileMapper(@"D:\temp");
    fileMapper.MapLocalFilesAsync().Wait(); //In test method async/await sintax does not work.
}

Now works as expected.

Paolo
  • 173
  • 2
  • 9
  • 1
    If you only want to get all files in the background, you could use `var files=await Task.Run(()=>DirectoryEnumerateFiles("*",SearchOptions.AllDirectories));` or the same with `DirectoryInfo`, to get the FileInfo objects. There's no need for recursive traversal – Panagiotis Kanavos Jul 20 '23 at 13:47