61

Is there a built-in mechanism in .NET to match patterns other than Regular Expressions? I'd like to match using UNIX style (glob) wildcards (* = any number of any character).

I'd like to use this for an end-user-facing control. I fear that permitting all RegEx capabilities will be very confusing.

dmo
  • /bin/sh style wildcards are called 'glob's. Retagging. – dmckee --- ex-moderator kitten Oct 09 '08 at 19:48
  • 1
    regex may be confusing, but it's powerful. I usually allow both by checking for s.StartsWith('/') && s.EndsWith('/') – Jamie Pate Mar 09 '13 at 19:04
  • 1
    I have written a globbing library for .NET, with tests and benchmarks. My goal was to produce a library for .NET, with minimal dependencies, that doesn't use Regex, and significantly outperforms Regex. You can find it here: https://github.com/dazinator/DotNet.Glob – Darrell Dec 12 '16 at 14:05
  • Darrell - please put your answer as answer, not comment. First I was checking this question - haven't found even your answer. (Since it was in comments) Also people can vote for it if it's good. – TarmoPikaro Mar 05 '17 at 13:52
  • @Darrell I have tested all the answers here (as of Sep 2018) including `Microsoft.Extensions.FileSystemGlobbing`. So far `DotNet.Glob` is the best. – Anton Krouglov Sep 11 '18 at 15:20
  • @AntonKrouglov Thanks – Darrell Sep 11 '18 at 18:13
  • @TarmoPikaro - Ok - I have added an answer as suggested! – Darrell Sep 11 '18 at 18:13
  • @TarmoPikaro - well I tried to add it as an answer but a moderator decided to delete it because "Whilst this may theoretically answer the question, it would be preferable to include the essential parts of the answer here, and provide the link for reference" - oh well! – Darrell Sep 11 '18 at 18:31

15 Answers

76

I like my code a little more semantic, so I wrote this extension method:

using System.Text.RegularExpressions;

namespace Whatever
{
    public static class StringExtensions
    {
        /// <summary>
        /// Compares the string against a given pattern.
        /// </summary>
        /// <param name="str">The string.</param>
        /// <param name="pattern">The pattern to match, where "*" means any sequence of characters, and "?" means any single character.</param>
        /// <returns><c>true</c> if the string matches the given pattern; otherwise <c>false</c>.</returns>
        public static bool Like(this string str, string pattern)
        {
            return new Regex(
                "^" + Regex.Escape(pattern).Replace(@"\*", ".*").Replace(@"\?", ".") + "$",
                RegexOptions.IgnoreCase | RegexOptions.Singleline
            ).IsMatch(str);
        }
    }
}

(change the namespace and/or copy the extension method to your own string extensions class)

Using this extension, you can write statements like this:

if (File.Name.Like("*.jpg"))
{
   ....
}

Just sugar to make your code a little more legible :-)

mindplay.dk
  • 2
    Excellent method. I would rename the parameter to `pattern` to avoid confusion that it's setting the wildcard characters themselves. – Eric Eskildsen May 13 '15 at 13:11
47

Just for the sake of completeness: since 2016, .NET Core has had a NuGet package called Microsoft.Extensions.FileSystemGlobbing that supports advanced globbing paths.

Some examples are searches over wildcard-nested folder structures and files, which are very common in web development scenarios:

  • wwwroot/app/**/*.module.js
  • wwwroot/app/**/*.js

This works somewhat similarly to how .gitignore files determine which files to exclude from source control.

cleftheris
  • 4
    I added this to my C# console app (50 lines of code) and NuGet pulled in 280 megabytes of package dependencies. So it may not be appropriate for all scenarios (or if anyone knows how to slim it down...) – Peter Hull Jul 11 '17 at 09:12
  • Can I use this package for old .net 4.6? – Denis535 Nov 28 '17 at 15:16
  • @wishmaster35 although you can not use the latest version of the lib since it is build against NetStandard 2.0 you can still use an older one ([v1.1.1](https://www.nuget.org/packages/Microsoft.Extensions.FileSystemGlobbing/1.1.1)) that can be used with NetFramework 4.5. Check it out – cleftheris Nov 29 '17 at 10:33
  • 3
    Beware that as of Dec 2018 `Microsoft.Extensions.FileSystemGlobbing` still does not work with "made up" files (that is remote or offloaded files). See https://github.com/aspnet/Extensions/issues/848 For this particular case I have used DotNet.Glob nuget package (https://stackoverflow.com/a/52281887/2746150) which is fast and tiny. – Anton Krouglov Dec 27 '18 at 14:48
  • Is there an example that shows this when enumerating files or folders? I don't want to enumerate my entire C: drive and compare every file against the glob, I would rather the library be smart enough to only enumerate files and folders in the most specific sub-folder before any kind of **, etc. – jjxtra Apr 13 '20 at 22:19
  • @jjxtra the execute method on the `Matcher` class takes as argument a `DirectoryInfo` so this should solve your issue [Matcher.Execute()](https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.filesystemglobbing.matcher.execute?view=dotnet-plat-ext-3.1#Microsoft_Extensions_FileSystemGlobbing_Matcher_Execute_Microsoft_Extensions_FileSystemGlobbing_Abstractions_DirectoryInfoBase_). Although this should be an other question on stackoverflow – cleftheris Apr 14 '20 at 09:13
  • 2
    I ended up searching for `/*` and using everything up to that index (if it exists) as the base path, and everything after as the glob. Seems to work ok. – jjxtra Apr 14 '20 at 18:29
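The base-path split described in the last comment can be sketched roughly as follows. This is a variation, not the commenter's actual code: it splits at the last directory separator before the first wildcard rather than searching for `/*`, and `SplitGlob` is a hypothetical helper name. It assumes C# 9 or later for top-level statements.

```csharp
using System;

// SplitGlob is a hypothetical helper name used only for this sketch.
// It splits at the last directory separator that precedes the first wildcard.
static (string BasePath, string Pattern) SplitGlob(string glob)
{
    int wildcard = glob.IndexOfAny(new[] { '*', '?' });
    if (wildcard < 0)
        return (glob, string.Empty);   // no wildcard: the whole input is a plain path

    int separator = glob.LastIndexOfAny(new[] { '/', '\\' }, wildcard);
    return separator < 0
        ? (string.Empty, glob)         // wildcard already in the first segment
        : (glob.Substring(0, separator), glob.Substring(separator + 1));
}

var (basePath, pattern) = SplitGlob("wwwroot/app/**/*.module.js");
Console.WriteLine(basePath); // wwwroot/app
Console.WriteLine(pattern);  // **/*.module.js
```

Only the base path then needs to be enumerated, with the remaining pattern handed to whichever matcher you use. Rooted patterns such as "/*.txt" would need extra handling, because this sketch returns an empty base path for them.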
40

I found the actual code for you:

Regex.Escape( wildcardExpression ).Replace( @"\*", ".*" ).Replace( @"\?", "." );
Jonathan C Dickinson
  • 5
    you probably also want to tack a "^" before and a "$" at the end of that to mimic UNIX/DOS globbing, right? – yoyoyoyosef Oct 23 '08 at 18:50
  • You may be right, I just copied that code verbatim (my regex isn't really as good as it should be). – Jonathan C Dickinson Oct 31 '08 at 06:51
  • 3
    I think I would replace "\*" with @"[^\\\.]*" - implying, any character except dot or slash, which are meaningful in the filename format. – Cheeso Mar 02 '09 at 21:55
  • Note: this works for *nix, while in many cases, Windows works differently: http://stackoverflow.com/a/16488364/119561 – deerchao May 10 '13 at 18:37
  • 2
    To properly support wildcard escapes and stuff, you need something more complicated than `string.Replace()`. This code would turn user-supplied `\*` into a Regex of `\\.*` which wouldn’t match the input string `*`. – binki Aug 24 '15 at 16:12
  • @Cheeso, only if you supplied `FNM_FILENAME|FNM_PERIOD` to [`fnmatch()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html). Also, no need to escape `.` inside `[.]` :-p. – binki Aug 24 '15 at 16:16
  • @binki - still need "[^\\]*", or "[^/\\]*" if you need to support non-DOS file paths. "foo\*.txt" should not match "foo\bar\baz.txt" – Tim Sparkles Nov 03 '16 at 00:42
  • @Timbo But that’s a DOS path. In unix, `foo*.txt` **should** match `foo\bar\baz.txt`. See https://gist.github.com/binki/d3dbc03b7f87c8c5d4e8d7cd7b74b8bd . Also, I’m convinced that `^[^\]*$` would match strings like `[`, `[]`, `[]]`, and so on. Let’s see: `$ expr '[' : '[^\]*'` yields `1`. Just as I thought, escaping `]` causes the `[` to no longer be treated like a special character ;-). EDIT: Actually, it seems to cause undefined behavior in `expr(1)`. Weird, heh. – binki Nov 03 '16 at 04:14
  • That is an excellent point. On unix, "foo*.txt" should not match "foo/bar/baz.txt" but it should match "foo\bar\baz.txt" because backslash is a valid filename character. On DOS, neither slash nor backslash is a valid filename character. It all depends on your use case. I'm not concerned with path validation, only with path matching, where candidate and pattern are both assumed to be valid. With your use case, wouldn't ".*" still have the same issue? – Tim Sparkles Nov 04 '16 at 02:01
  • Is there an example that shows this when enumerating files or folders? I don't want to enumerate my entire C: drive and compare every file against the glob, I would rather the library be smart enough to only enumerate files and folders in the most specific sub-folder before any kind of **, etc. – jjxtra Apr 13 '20 at 21:51
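To illustrate the anchoring point raised in the first comment, here is a minimal sketch. `GlobToRegex` is a hypothetical helper name, not a framework API; it wraps the one-liner from the answer and optionally adds "^" and "$". It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// GlobToRegex is a hypothetical helper name, not a framework API.
// It escapes the glob, then re-enables * and ? as wildcards.
static string GlobToRegex(string glob, bool anchored)
{
    string body = Regex.Escape(glob).Replace(@"\*", ".*").Replace(@"\?", ".");
    return anchored ? "^" + body + "$" : body;
}

string unanchored = GlobToRegex("*.txt", anchored: false);
string anchoredPattern = GlobToRegex("*.txt", anchored: true);

// Without anchors the regex only has to match somewhere inside the input.
Console.WriteLine(Regex.IsMatch("notes.txt.bak", unanchored));      // True
Console.WriteLine(Regex.IsMatch("notes.txt.bak", anchoredPattern)); // False
Console.WriteLine(Regex.IsMatch("notes.txt", anchoredPattern));     // True
```

This sketch does not address the escaping or path-separator concerns discussed in the other comments; it only shows why the anchors matter.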
11

The 2- and 3-argument variants of the listing methods like GetFiles() and EnumerateDirectories() take a search string as their second argument that supports filename globbing, with both * and ?.

using System;
using System.IO;

class GlobTestMain
{
    static void Main(string[] args)
    {
        string[] exes = Directory.GetFiles(Environment.CurrentDirectory, "*.exe");
        foreach (string file in exes)
        {
            Console.WriteLine(Path.GetFileName(file));
        }
    }
}

would yield

GlobTest.exe
GlobTest.vshost.exe

The docs state that there are some caveats with matching extensions. They also state that 8.3 file names are matched (these may be generated automatically behind the scenes), which can result in "duplicate" matches for some patterns.

The methods that support this are GetFiles(), GetDirectories(), and GetFileSystemEntries(). The Enumerate variants also support this.
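As a rough sketch of the Enumerate variants and the 3-argument form, the following builds a throwaway directory tree and searches it with and without recursion. The directory and file names are made up for the example, and it assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.IO;
using System.Linq;

// Build a throwaway directory tree so the example is self-contained.
string root = Path.Combine(Path.GetTempPath(), "glob-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(Path.Combine(root, "sub"));
File.WriteAllText(Path.Combine(root, "a.log"), "");
File.WriteAllText(Path.Combine(root, "b.txt"), "");
File.WriteAllText(Path.Combine(root, "sub", "c.log"), "");

// The 2-argument form searches the top directory only;
// SearchOption.AllDirectories also walks sub-folders.
var top = Directory.EnumerateFiles(root, "*.log")
    .Select(p => Path.GetFileName(p)).OrderBy(n => n).ToList();
var all = Directory.EnumerateFiles(root, "*.log", SearchOption.AllDirectories)
    .Select(p => Path.GetFileName(p)).OrderBy(n => n).ToList();

Console.WriteLine(string.Join(", ", top)); // a.log
Console.WriteLine(string.Join(", ", all)); // a.log, c.log

Directory.Delete(root, recursive: true);
```

Note that the search pattern applies to file names only; it does not support "**" or wildcards in directory segments.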

Luke Woodward
Dan Mangiarelli
7

If you want to avoid regular expressions this is a basic glob implementation:

public static class Globber
{
    public static bool Glob(this string value, string pattern)
    {
        int pos = 0;

        while (pattern.Length != pos)
        {
            switch (pattern[pos])
            {
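                case '?':
                    // '?' must consume exactly one character, so fail if the value is exhausted.
                    // Without this check, a pattern such as "a?c" throws on the value "a".
                    if (value.Length == pos)
                    {
                        return false;
                    }
                    break;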

                case '*':
                    for (int i = value.Length; i >= pos; i--)
                    {
                        if (Glob(value.Substring(i), pattern.Substring(pos + 1)))
                        {
                            return true;
                        }
                    }
                    return false;

                default:
                    if (value.Length == pos || char.ToUpper(pattern[pos]) != char.ToUpper(value[pos]))
                    {
                        return false;
                    }
                    break;
            }

            pos++;
        }

        return value.Length == pos;
    }
}

Use it like this:

Assert.IsTrue("text.txt".Glob("*.txt"));
Tony Edgecombe
  • I notice the use of `char.ToUpper` method. Does this mean that this solution is case insensitive, such that `Assert.IsTrue("TEXT.TXT".Glob("*.txt"));` also passes? – Dan Stevens Feb 22 '23 at 12:46
  • I really like this. This is all the code necessary, that's simple and easy to read, uses recursion, and seems to handle all the edge cases I can imagine at a glance. – Tom Mayfield May 16 '23 at 16:37
5

If you use VB.Net, you can use the Like operator, which has glob-like syntax.

http://www.getdotnetcode.com/gdncstore/free/Articles/Intoduction%20to%20the%20VB%20NET%20Like%20Operator.htm

torial
  • this is exactly what I'm looking for, but is it available in C#? – dmo Oct 09 '08 at 20:06
  • The closest you'll get w/ C# (aside from implementing it yourself) is to use Linq: http://books.google.com/books?id=zQvH6HuZk50C&pg=RA2-PA104&lpg=RA2-PA104&dq=.net+like+keyword+linq&source=web&ots=w9whKY2-xK&sig=xUTjJcQ0kK9oTxMKD5qok4dIDa8&hl=en&sa=X&oi=book_result&resnum=7&ct=result#PRA2-PA105,M1 – torial Oct 09 '08 at 20:18
  • Otherwise, you'll need to write the module in VB.Net as a DLL project, and reference the DLL in C#. VB.Net users have to do that to take advantage of the yield return statement. – torial Oct 09 '08 at 20:20
  • 1
    The link above from torial is specific to VB.Net as well. – dmo Oct 09 '08 at 21:00
  • I've added an answer that shows how to use VB's LIKE implementation from C# without having to build or deploy a custom VB DLL. It just uses the Microsoft.VisualBasic.dll that's included with the .NET Framework. – Bill Menees Aug 18 '16 at 18:08
5

I have written a globbing library for .NETStandard, with tests and benchmarks. My goal was to produce a library for .NET, with minimal dependencies, that doesn't use Regex, and outperforms Regex.

You can find it here: https://github.com/dazinator/DotNet.Glob

Anton Krouglov
Darrell
4

I wrote a FileSelector class that does selection of files based on filenames. It also selects files based on time, size, and attributes. If you just want filename globbing then you express the name in forms like "*.txt" and similar. If you want the other parameters then you specify a boolean logic statement like "name = *.xls and ctime < 2009-01-01" - implying an .xls file created before January 1st 2009. You can also select based on the negative: "name != *.xls" means all files that are not xls.

Check it out. Open source. Liberal license. Free to use elsewhere.

Aristos
Cheeso
2

Based on previous posts, I threw together a C# class:

using System;
using System.Text.RegularExpressions;

public class FileWildcard
{
    Regex mRegex;

    public FileWildcard(string wildcard)
    {
        string pattern = string.Format("^{0}$", Regex.Escape(wildcard)
            .Replace(@"\*", ".*").Replace(@"\?", "."));
        mRegex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
    public bool IsMatch(string filenameToCompare)
    {
        return mRegex.IsMatch(filenameToCompare);
    }
}

Using it would go something like this:

FileWildcard w = new FileWildcard("*.txt");
if (w.IsMatch("Doug.Txt"))
   Console.WriteLine("We have a match");

The matching is NOT the same as the System.IO.Directory.GetFiles() method, so don't use them together.

Doug Clutter
  • Nice code, but it doesn't seem to like filename extensions longer than 3 chars. If I try to do IsMatch on a filename like "mike.xls?" then it'll fail on "mike.xlsx". If I use "mike.xl*" as the wildcard, it works okay though. – Mike Gledhill Mar 22 '12 at 14:02
2

From C# you can use .NET's LikeOperator.LikeString method. That's the backing implementation for VB's LIKE operator. It supports patterns using *, ?, #, [charlist], and [!charlist].

You can use the LikeString method from C# by adding a reference to the Microsoft.VisualBasic.dll assembly, which is included with every version of the .NET Framework. Then you invoke the LikeString method just like any other static .NET method:

using Microsoft.VisualBasic;
using Microsoft.VisualBasic.CompilerServices;
...
bool isMatch = LikeOperator.LikeString("I love .NET!", "I love *", CompareMethod.Text);
// isMatch should be true.
Bill Menees
2

https://www.nuget.org/packages/Glob.cs

https://github.com/mganss/Glob.cs

A GNU Glob for .NET.

You can get rid of the package reference after installing and just compile the single Glob.cs source file.

And because it's an implementation of GNU Glob, the patterns are cross-platform, and cross-language too once you find a similar implementation elsewhere. Enjoy!

  • Worked like a charm for me. Other options to expand Glob patterns were cumbersome. One line (sic!): `var dlls = Glob.Expand(@"c:\windows\system32\**\*.dll")` – Anton Krouglov Dec 27 '18 at 15:45
1

I don't know if the .NET framework has glob matching, but couldn't you replace the * with .*? and use regexes?
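A quick sketch of what that replacement looks like in practice, taking the replacement text `.*?` literally as written above. It also shows one pitfall of replacing without escaping first: the "." in the glob stays a regex metacharacter. The file names are made up for the example, and it assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

string glob = "*.txt";

// Naive: only swap the wildcard, leaving "." as a regex metacharacter.
string naive = "^" + glob.Replace("*", ".*?") + "$";
// Escaped first: "." becomes a literal dot before "*" is re-enabled.
string escaped = "^" + Regex.Escape(glob).Replace(@"\*", ".*?") + "$";

Console.WriteLine(Regex.IsMatch("readme_txt", naive));   // True (unintended)
Console.WriteLine(Regex.IsMatch("readme_txt", escaped)); // False
Console.WriteLine(Regex.IsMatch("readme.txt", escaped)); // True
```

With the pattern anchored at both ends, the lazy `.*?` and the greedy `.*` accept the same inputs; the difference only matters for what each group captures.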

Ferruccio
0

Just out of curiosity I glanced at Microsoft.Extensions.FileSystemGlobbing, and it was dragging in quite large dependencies on quite a few libraries, so I decided to try writing something similar myself.

Well, easier said than done. I quickly noticed that it was not such a trivial function after all: for example, "*.txt" should match files only in the current directory, while "**.txt" should also harvest sub-folders.

Microsoft also tests some odd matching pattern sequences like "./*.txt". I'm not sure who actually needs the "./" kind of string, since it is removed anyway during processing. (https://github.com/aspnet/FileSystem/blob/dev/test/Microsoft.Extensions.FileSystemGlobbing.Tests/PatternMatchingTests.cs)

Anyway, I've coded my own function, and there will be two copies of it: one in svn (I might bugfix it later on), and one sample copied here for demo purposes. I recommend copying from the svn link.

SVN Link:

https://sourceforge.net/p/syncproj/code/HEAD/tree/SolutionProjectBuilder.cs#l800 (Search for matchFiles function if not jumped correctly).

And here is also local function copy:

/// <summary>
/// Matches files from folder _dir using glob file pattern.
/// In glob patterns, * matches any file or folder name, and ** matches any path (including sub-folders).
/// ? matches any single character.
/// 
/// There is also a third-party library for similar matching, 'Microsoft.Extensions.FileSystemGlobbing',
/// but it was dragging in a lot of dependencies, so I decided to do without it.
/// </summary>
/// <returns>List of files that match your selection</returns>
static public String[] matchFiles( String _dir, String filePattern )
{
    if (filePattern.IndexOfAny(new char[] { '*', '?' }) == -1)      // Speed up matching: with no asterisk or other wildcard, the pattern is simply a file path.
    {
        String path = Path.Combine(_dir, filePattern);
        if (File.Exists(path))
            return new String[] { filePattern };
        return new String[] { };
    }

    String dir = Path.GetFullPath(_dir);        // Make it absolute, just so we can extract relative paths later on.
    String[] pattParts = filePattern.Replace("/", "\\").Split('\\');
    List<String> scanDirs = new List<string>();
    scanDirs.Add(dir);

    //
    //  By default, glob pattern matching treats "*" as any file / folder name,
    //  which corresponds to any character except the folder separator - in regex that's "[^\\]*".
    //  Glob matching also allows a double asterisk "**", which additionally recurses into subfolders.
    //  We split here each part of match pattern and match it separately.
    //
    for (int iPatt = 0; iPatt < pattParts.Length; iPatt++)
    {
        bool bIsLast = iPatt == (pattParts.Length - 1);
        bool bRecurse = false;

        String regex1 = Regex.Escape(pattParts[iPatt]);         // Escape special regex control characters ("*" => "\*", "." => "\.")
        String pattern = Regex.Replace(regex1, @"\\\*(\\\*)?", delegate (Match m)
            {
                if (m.ToString().Length == 4)   // "**" => "\*\*" (escaped) - we need to recurse into sub-folders.
                {
                    bRecurse = true;
                    return ".*";
                }
                else
                    return @"[^\\]*";
            }).Replace(@"\?", ".");

        if (pattParts[iPatt] == "..")                           // Special kind of control, just to scan upper folder.
        {
            for (int i = 0; i < scanDirs.Count; i++)
                scanDirs[i] = scanDirs[i] + "\\..";

            continue;
        }

        // Anchor the pattern so the whole name must match, not just a substring of it.
        Regex re = new Regex("^" + pattern + "$", RegexOptions.Compiled | RegexOptions.IgnoreCase);
        int nScanItems = scanDirs.Count;
        for (int i = 0; i < nScanItems; i++)
        {
            String[] items;
            if (!bIsLast)
                items = Directory.GetDirectories(scanDirs[i], "*", (bRecurse) ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);
            else
                items = Directory.GetFiles(scanDirs[i], "*", (bRecurse) ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);

            foreach (String path in items)
            {
                String matchSubPath = path.Substring(scanDirs[i].Length + 1);
                if (re.Match(matchSubPath).Success)
                    scanDirs.Add(path);
            }
        }
        scanDirs.RemoveRange(0, nScanItems);    // Remove items what we have just scanned.
    } //for

    //  Make relative and return.
    return scanDirs.Select( x => x.Substring(dir.Length + 1) ).ToArray();
} //matchFiles

If you find any bugs, I'll be glad to fix them.

TarmoPikaro
0

I wrote a solution that does it. It does not depend on any library and it does not support "!" or "[]" operators. It supports the following search patterns:

C:\Logs\*.txt

C:\Logs\**\*P1?\**\asd*.pdf

    /// <summary>
    /// Finds files for the given glob path. It supports ** * and ? operators. It does not support !, [] or ![] operators
    /// </summary>
    /// <param name="path">the path</param>
    /// <returns>The files that match the glob</returns>
    private ICollection<FileInfo> FindFiles(string path)
    {
        List<FileInfo> result = new List<FileInfo>();
        //The name of the file can be any but the following chars '<','>',':','/','\','|','?','*','"'
        const string folderNameCharRegExp = @"[^\<\>:/\\\|\?\*" + "\"]";
        const string folderNameRegExp = folderNameCharRegExp + "+";
        //We obtain the file pattern
        string filePattern = Path.GetFileName(path);
        List<string> pathTokens = new List<string>(Path.GetDirectoryName(path).Split('\\', '/'));
        //We obtain the root path from which the rest of the files will be obtained
        string rootPath = null;
        bool containsWildcardsInDirectories = false;
        for (int i = 0; i < pathTokens.Count; i++)
        {
            if (!pathTokens[i].Contains("*")
                && !pathTokens[i].Contains("?"))
            {
                if (rootPath != null)
                    rootPath += "\\" + pathTokens[i];
                else
                    rootPath = pathTokens[i];
                pathTokens.RemoveAt(0);
                i--;
            }
            else
            {
                containsWildcardsInDirectories = true;
                break;
            }
        }
        if (Directory.Exists(rootPath))
        {
            //We build the regular expression that the folders should match
            string regularExpression = rootPath.Replace("\\", "\\\\").Replace(":", "\\:").Replace(" ", "\\s");
            foreach (string pathToken in pathTokens)
            {
                if (pathToken == "**")
                {
                    regularExpression += string.Format(CultureInfo.InvariantCulture, @"(\\{0})*", folderNameRegExp);
                }
                else
                {
                    regularExpression += @"\\" + pathToken.Replace("*", folderNameCharRegExp + "*").Replace(" ", "\\s").Replace("?", folderNameCharRegExp);
                }
            }
            Regex globRegEx = new Regex(regularExpression, RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
            string[] directories = Directory.GetDirectories(rootPath, "*", containsWildcardsInDirectories ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);
            foreach (string directory in directories)
            {
                if (globRegEx.Matches(directory).Count > 0)
                {
                    DirectoryInfo directoryInfo = new DirectoryInfo(directory);
                    result.AddRange(directoryInfo.GetFiles(filePattern));
                }
            }

        }
        return result;
    }
Jon
0

Unfortunately the accepted answer will not handle escaped input correctly, because string .Replace("\*", ".*") fails to distinguish between "*" and "\*" - it will happily replace "*" in both of these strings, leading to incorrect results.

Instead, a basic tokenizer can be used to convert the glob path into a regex pattern, which can then be matched against a filename using Regex.Match. This is a more robust and flexible solution.

Here is a method to do this. It handles ?, *, and **, and surrounds each of these globs with a capture group, so the values of each glob can be inspected after the Regex has been matched.

static string GlobbedPathToRegex(ReadOnlySpan<char> pattern, ReadOnlySpan<char> dirSeparatorChars)
{
    StringBuilder builder = new StringBuilder();
    builder.Append('^');

    ReadOnlySpan<char> remainder = pattern;

    while (remainder.Length > 0)
    {
        int specialCharIndex = remainder.IndexOfAny('*', '?');

        if (specialCharIndex >= 0)
        {
            ReadOnlySpan<char> segment = remainder.Slice(0, specialCharIndex);

            if (segment.Length > 0)
            {
                string escapedSegment = Regex.Escape(segment.ToString());
                builder.Append(escapedSegment);
            }

            char currentCharacter = remainder[specialCharIndex];
            char nextCharacter = specialCharIndex < remainder.Length - 1 ? remainder[specialCharIndex + 1] : '\0';

            switch (currentCharacter)
            {
                case '*':
                    if (nextCharacter == '*')
                    {
                        // We have a ** glob expression
                        // Match any character, 0 or more times.
                        builder.Append("(.*)");

                        // Skip over **
                        remainder = remainder.Slice(specialCharIndex + 2);
                    }
                    else
                    {
                        // We have a * glob expression
                        // Match any character that isn't a dirSeparatorChar, 0 or more times.
                        if(dirSeparatorChars.Length > 0) {
                            builder.Append($"([^{Regex.Escape(dirSeparatorChars.ToString())}]*)");
                        }
                        else {
                            builder.Append("(.*)");
                        }

                        // Skip over *
                        remainder = remainder.Slice(specialCharIndex + 1);
                    }
                    break;
                case '?':
                    builder.Append("(.)"); // Regex equivalent of ?

                    // Skip over ?
                    remainder = remainder.Slice(specialCharIndex + 1);
                    break;
            }
        }
        else
        {
            // No more special characters, append the rest of the string
            string escapedSegment = Regex.Escape(remainder.ToString());
            builder.Append(escapedSegment);
            remainder = ReadOnlySpan<char>.Empty;
        }
    }

    builder.Append('$');

    return builder.ToString();
}

Then to use it:

string testGlobPathInput = "/Hello/Test/Blah/**/test*123.fil?";
string globPathRegex = GlobbedPathToRegex(testGlobPathInput, "/"); // Could use "\\/" directory separator chars on Windows

Console.WriteLine($"Globbed path: {testGlobPathInput}");
Console.WriteLine($"Regex conversion: {globPathRegex}");

string testPath = "/Hello/Test/Blah/All/Hail/The/Hypnotoad/test_somestuff_123.file";
Console.WriteLine($"Test Path: {testPath}");
var regexGlobPathMatch = Regex.Match(testPath, globPathRegex);

Console.WriteLine($"Match: {regexGlobPathMatch.Success}");

for(int i = 0; i < regexGlobPathMatch.Groups.Count; i++) {
    Console.WriteLine($"Group [{i}]: {regexGlobPathMatch.Groups[i]}");
}

Output:

Globbed path: /Hello/Test/Blah/**/test*123.fil?
Regex conversion: ^/Hello/Test/Blah/(.*)/test([^/]*)123\.fil(.)$
Test Path: /Hello/Test/Blah/All/Hail/The/Hypnotoad/test_somestuff_123.file
Match: True
Group [0]: /Hello/Test/Blah/All/Hail/The/Hypnotoad/test_somestuff_123.file
Group [1]: All/Hail/The/Hypnotoad
Group [2]: _somestuff_
Group [3]: e

I have created a gist here as a canonical version of this method:

https://gist.github.com/crozone/9a10156a37c978e098e43d800c6141ad

Ryan