
According to the Microsoft Docs page for Directory.EnumerateFiles, a search pattern whose extension is exactly three characters long matches any file whose extension begins with those characters. However, this only works on local drives, not on file shares.

For a directory of \\share\folder\ containing a single file named file.xlsx, this first code snippet does not return it:

public static List<string> GetAllFilesFromDirectory(string directory) =>
   new[] { "*.csv", "*.xls", "*.txt" }.SelectMany(ext => Directory.EnumerateFiles(directory, ext)).ToList();

However, if I add the *.xlsx pattern, it does return it:

public static List<string> GetAllFilesFromDirectory(string directory) =>
   new[] { "*.csv", "*.xls", "*.xlsx", "*.txt" }.SelectMany(ext => Directory.EnumerateFiles(directory, ext)).ToList();

I also tested this with the same file in the C:\temp directory, and it was returned both ways.

This is running in a .NET Framework 4.7.2 console app.

Am I missing something in the search pattern? Or does this not work with file shares the same way as local drives? Would this be expected?

Scott Hoffman

1 Answer


You must be the unluckiest person, to have hit this bug. I can confirm that it behaves as per your observation, and also couldn't find any references to this anywhere on the interwebs.

So I traced the .NET source code to see how Directory.EnumerateFiles works and - deep within the bowels - eventually ran into a call to FindFirstFile and subsequent FindNextFile calls. These are P/Invoked straight from kernel32.dll, so you can't get much lower than that from managed code.

[DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
public static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

Well gotta test that then. Guess what? It catches the XLSX file in local directories, but not in network shares.

The doc for the function does not mention this behaviour either. So yeah. You've just hit an undocumented "feature" :)

Edit: This just got better. Looks like in .NET Core (from 2.0 all the way to .NET 5) this behaviour isn't there anymore. They actually wrote their own pattern matcher this time round. *.xls would not catch XLSX in any folders, local or otherwise. Yet their documentation still says that it should.

Edit 2021: The doco has now been updated with a remark about the quirk on .NET Framework.
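
If you need the same result on local drives and file shares, one option is to skip the search pattern and compare extensions yourself. This is only a minimal sketch: the class and method names (ExactExtensionFilter, GetFilesWithExactExtensions) are hypothetical and not part of .NET, and it assumes extensions are passed with the leading dot and should be compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExactExtensionFilter
{
    // Hypothetical helper (not a framework API): returns only files whose
    // full extension exactly matches one of the given extensions.
    // Extensions are expected with the leading dot, e.g. ".xls".
    public static List<string> GetFilesWithExactExtensions(string directory, params string[] extensions)
    {
        // Assumption: case-insensitive comparison, as is usual for Windows file names.
        var wanted = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);

        // Enumerate every file without a pattern, then filter on the extension
        // reported by Path.GetExtension (which includes the leading dot).
        return Directory.EnumerateFiles(directory)
            .Where(path => wanted.Contains(Path.GetExtension(path)))
            .ToList();
    }
}
```

With this, GetFilesWithExactExtensions(@"\\share\folder", ".csv", ".xls", ".txt") would skip file.xlsx in both kinds of location, and you would add ".xlsx" explicitly to include it. The trade-off is that every file in the directory is enumerated before filtering.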


Here's my test code for the FindFirstFile call:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public class Program
{
    public static void Main(string[] args)
    {
        // Ensure these test folders only contain ONE file.
        // Name the file "Test.xlsx"
        Test(@"C:\Temp\*.xls"); // Finds the xlsx file just fine
        Test(@"\\Server\Temp\*.xls"); // But not here!
    }

    public static void Test(string fileName)
    {
        Win32Native.WIN32_FIND_DATA data;
        var hnd = Win32Native.FindFirstFile(fileName, out data);

        if (hnd == Win32Native.InvalidPtr) 
            Debug.WriteLine("Not found!!");
        else
            Debug.WriteLine("Found: " + data.cFileName);
    }
}

/** Windows native Pinvoke **/
public class Win32Native
{
    public static IntPtr InvalidPtr = new IntPtr(-1);

    [DllImport("kernel32.dll", CharSet = CharSet.Auto)]
    public static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
    public struct WIN32_FIND_DATA
    {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }
}
NPras
  • Good investigation. I wonder if we'd see a difference if we were to use the wide versions of the calls explicitly. I suspect depending on which is used, it calls legacy code that isn't long filename aware and somehow truncates the extension. – Jeff Mercado Oct 28 '20 at 07:24