103

I would like to match strings with a wildcard (*), where the wildcard means "any". For example:

*X = string must end with X
X* = string must start with X
*X* = string must contain X

Also, some compound uses such as:

*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.

Is there a simple algorithm to do this? I'm unsure about using regex (though it is a possibility).

To clarify, the users will type in the above to a filter box (as simple a filter as possible), I don't want them to have to write regular expressions themselves. So something I can easily transform from the above notation would be good.

Dmitry Bychenko
Robinson
  • Should `YZ ABC X` match `*X*YZ*`, i.e. do the substrings need to appear in the same order in both the string and the pattern or not? I'd assume it shouldn't match, but "string contains X and contains YZ" doesn't make it clear. If it should match, all the current answers are wrong. – Bernhard Barker May 18 '15 at 15:23
  • That would be a no. In the example given, X must appear before YZ. – Robinson May 19 '15 at 07:25

11 Answers

191

Wildcards usually come in two kinds:

  ? - any character  (one and only one)
  * - any characters (zero or more)

so you can easily convert these rules into an equivalent regular expression:

// If you want to implement both "*" and "?"
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$"; 
}

// If you want to implement "*" only
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$"; 
}

And then you can use Regex as usual:

  String test = "Some Data X";

  Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
  Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
  Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));

  // Starts with S, ends with X, contains "me" and "a" (in that order) 
  Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));
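
For the filter-box scenario in the question, the conversion above could be built once per filter text and then reused for every item. The following is only a sketch written as C# 9 top-level statements: `BuildFilter`, the sample items and the choice of `RegexOptions.IgnoreCase` are assumptions for illustration, not part of the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: same conversion as above, compiled into one reusable Regex.
// IgnoreCase is an assumption about what a filter box usually wants.
static Regex BuildFilter(string wildcard) =>
    new Regex("^" + Regex.Escape(wildcard).Replace("\\?", ".").Replace("\\*", ".*") + "$",
              RegexOptions.IgnoreCase);

string[] items = { "alpha.txt", "beta.txt", "alphabet", "a.txt" };

// Build the regex once, then apply it to every item.
Regex filter = BuildFilter("a*.txt");
string[] hits = items.Where(item => filter.IsMatch(item)).ToArray();

Console.WriteLine(string.Join(", ", hits));
```

Because of `Regex.Escape`, the dot in `a*.txt` is treated as a literal dot, so `alphabet` is not selected.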
Dmitry Bychenko
  • It's not as easy as you claim. For example, one specialty is that when using `Directory.GetFiles`, a three letter extension `.htm` would also match `.html`, but a two letter extension `.ai` would not match `aix` or `aifg`. Windows wildcards are trivial on first sight, but under the hood, they're a bunch of grown legacy hypercomplex rulesets. – Sebastian Mach Jan 22 '18 at 14:27
  • 7
    @Sebastian Mach: Thank you for mentioning the nuance! I agree that MS DOS (and Windows) interpretation of the *wild cards* is different from standard one https://en.wikipedia.org/wiki/Wildcard_character However, the question is about strings and it doesn't mention files; that's why I've put the simplest solution assuming `*` being any characters (zero or more) and `?` being exactly one character . – Dmitry Bychenko Jan 22 '18 at 15:07
  • 2
    The original question was for string identifiers, not the filesystem, correct. – Robinson Jan 22 '18 at 21:20
  • 9
    If you worry about performance, [here's a C# implementation of a wildcard matching algorithm](https://bitbucket.org/hasullivan/fast-wildcard-matching) which is a lot faster than RegEx for this specific problem. – Dan Sep 05 '18 at 19:39
  • @Sebastian Mach... those are the 8.3 filenames that it matches. – Wouter Aug 06 '22 at 23:54
  • If you're going to use RegEx, you'll need to escape special RegEx token characters first, such as periods, backslashes, etc., like so: "^" + Regex.Escape(pattern).Replace("\\\*", ".*").Replace("\\?", ".") + "$"; – dynamichael Jul 06 '23 at 12:58
  • @dynamichael: Yes, you are quite right, escaping is mandatory for an arbitrary `value`. However, escaping by hand with `Replace` is bad practice: there are more special symbols than the ones you mentioned, and the list is open-ended (what if regex introduces more of them in future versions?). That's why I use `Regex.Escape(value)` and then `Replace` only the wildcards `?` and `*`. – Dmitry Bychenko Jul 06 '23 at 13:12
38

You could use the VB.NET Like-Operator:

string text = "x is not the same as X and yz not the same as YZ";
bool contains = LikeOperator.LikeString(text,"*X*YZ*", Microsoft.VisualBasic.CompareMethod.Binary);  

Use CompareMethod.Text if you want to ignore the case.

You need to add using Microsoft.VisualBasic.CompilerServices; and add a reference to the Microsoft.VisualBasic.dll.

It ships with the .NET Framework, and according to the comments below it is also supported in .NET Core from version 3.0 onwards, so depending on this class should not be a problem.
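
To illustrate the difference between the two compare methods, here is a small sketch (C# 9 top-level statements, sample strings made up for illustration). Note that the Like operator also gives special meaning to `?`, `#` and `[...]`, so text typed by a user that contains those characters is not matched literally.

```csharp
using System;
using Microsoft.VisualBasic;
using Microsoft.VisualBasic.CompilerServices;

// Binary compares character codes, so lower-case 'x' does not satisfy "*X".
bool binary = LikeOperator.LikeString("some data x", "*X", CompareMethod.Binary);

// Text compares case-insensitively.
bool text = LikeOperator.LikeString("some data x", "*X", CompareMethod.Text);

// '#' is a Like wildcard for a single digit, not a literal '#'.
bool digit = LikeOperator.LikeString("item7", "item#", CompareMethod.Binary);

Console.WriteLine($"{binary} {text} {digit}");
```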

Tim Schmelter
  • hmm, adding "using" results in:`Type or namespace name 'CompilerServices' does not exist in namespace 'Microsoft.VisualBasic' (are you missing an assembly reference?` – dylanh724 May 31 '17 at 13:48
  • 3
    You need to add a reference to the Microsoft.VisualBasic.dll: https://stackoverflow.com/a/21212268/284240 – Tim Schmelter May 31 '17 at 14:21
  • 1
    It appears that this is no longer available in .Net 4.6. :( – Andrew Rondeau Mar 21 '18 at 18:59
  • @AndrewRondeau Are you sure? May have to update the correct answer in that case, i.e. I'm guessing right now it's a bug waiting to happen for me. – Robinson Jul 17 '18 at 11:22
  • 1
    I'm using 4.7 and it works fine. There is a note on the website saying it's not supported in .NET Core and .NET Standard projects though. – VoteCoffee Jan 20 '20 at 21:46
  • 5
    It is now supported in .NET Core, version 3.0 onwards: https://learn.microsoft.com/en-us/dotnet/api/microsoft.visualbasic.compilerservices.likeoperator.likestring?view=netcore-3.1#moniker-applies-to – Holf Jul 05 '20 at 10:15
28

For those using .NET Core 2.1+ or .NET 5+, you can use the FileSystemName.MatchesSimpleExpression method in the System.IO.Enumeration namespace.

string text = "X is a string with ZY in the middle and at the end is P";
bool isMatch = FileSystemName.MatchesSimpleExpression("X*ZY*P", text);

Both parameters are actually ReadOnlySpan<char> but you can use string arguments too. There's also an overloaded method if you want to turn on/off case matching. It is case insensitive by default as Chris mentioned in the comments.
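
A short sketch of the case-sensitivity overload mentioned above, using the optional `ignoreCase` parameter (sample strings are made up for illustration; written as C# 9 top-level statements):

```csharp
using System;
using System.IO.Enumeration;

string name = "X has ZY and P";

// ignoreCase defaults to true, so a lower-case pattern still matches.
bool insensitive = FileSystemName.MatchesSimpleExpression("x*zy*p", name);

// Pass ignoreCase: false to require an exact-case match.
bool sensitive = FileSystemName.MatchesSimpleExpression("x*zy*p", name, ignoreCase: false);

Console.WriteLine($"{insensitive} {sensitive}");
```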

Jamie Lester
20

Using WildcardPattern from System.Management.Automation may be an option.

var pattern = new WildcardPattern(patternString);
bool isMatch = pattern.IsMatch(stringToMatch);

The Visual Studio UI may not allow you to add the System.Management.Automation assembly to the References of your project. Feel free to add it manually, as described here.

Community
VirtualVDX
  • 1
    Although this is a great solution, unfortunately `WildCardPattern` is not supported by .NET Core apps (it is supported up to .NET Standard 2.1 / recent Framework though). Further, some users report that [`System.Management.Automation` is not meant to be used directly](https://stackoverflow.com/a/66607812/3873799). I fell into the trap of relying on this answer (which works great!) but now I am required to upgrade to .NET Core and I find myself out of luck. – alelom Nov 10 '21 at 19:04
  • I will try [this answer](https://stackoverflow.com/a/66465594/3873799) for an alternative. – alelom Nov 10 '21 at 19:07
7

A wildcard * can be translated as .* or .*? regex pattern.

You might need to use a singleline mode to match newline symbols, and in this case, you can use (?s) as part of the regex pattern.

You can set it for the whole or part of the pattern:

X* => @"^X(?s:.*)"
*X => @"(?s:.*)X$"
*X* => @"(?s).*X.*"
*X*YZ* => @"(?s).*X.*YZ.*"
X*YZ*P => @"(?s:^X.*YZ.*P$)"
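
The effect of singleline mode can be seen with a value that contains a newline. A minimal sketch (the sample string is made up; written as C# 9 top-level statements):

```csharp
using System;
using System.Text.RegularExpressions;

string multiLine = "X first line\nYZ second line P";

// Without (?s), '.' does not match '\n', so ".*" cannot reach "YZ" on the second line.
bool withoutSingleline = Regex.IsMatch(multiLine, @"^X.*YZ.*P$");

// With (?s), '.' matches any character including '\n'.
bool withSingleline = Regex.IsMatch(multiLine, @"(?s)^X.*YZ.*P$");

Console.WriteLine($"{withoutSingleline} {withSingleline}");
```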
Wiktor Stribiżew
5

*X*YZ* = string contains X and contains YZ

@".*X.*YZ"

X*YZ*P = string starts with X, contains YZ and ends with P.

@"^X.*YZ.*P$"
Avinash Raj
4

Bear in mind that Regex.IsMatch returns true for XYZ when the pattern is Y*, because an unanchored regex may match anywhere in the string. To avoid that, I prepend the "^" anchor:

Regex.IsMatch(str1, "^" + str2.Replace("*", ".*?"));  

So, full code to solve your problem is

    bool isMatchStr(string str1, string str2)
    {
        string s1 = str1.Replace("*", ".*?");
        string s2 = str2.Replace("*", ".*?");
        bool r1 = Regex.IsMatch(s1, "^" + s2);
        bool r2 = Regex.IsMatch(s2, "^" + s1);
        return r1 || r2;
    }
  • 2
    Welcome to Stack Overflow! While you may have solved the asker's problem, code-only answers are not very helpful to others who come across this question. Please edit your answer to explain why your code solves the original problem. – Joe C Jan 22 '17 at 22:50
  • This solution would work if you're simply matching alphanumeric characters and a few others, but it'd fail if you were trying to match any other character that defines the syntax of the regular expression, for example "*/*" or "*[*", just to name a couple. – jimmyfever Jan 22 '19 at 13:59
  • Besides, you should also add $ to the end, so that *.ab does not match foo.abc, and escape the . character itself (and whatever other regular expression characters you may want to use). And why do you first match s1 against s2 and then s2 against s1? Make one parameter the pattern, and the other parameter the matched string. – Mike Rosoft Sep 06 '21 at 06:52
  • Also, why on Earth do you replace `*` with `.*?`? – Mike Rosoft Sep 06 '21 at 07:49
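
The two problems raised in these comments can be reproduced in a few lines. This is only a sketch: `Naive` and `Escaped` are hypothetical helper names, where `Naive` mirrors the unescaped conversion from the answer and `Escaped` adds `Regex.Escape` plus a trailing `$` (written as C# 9 top-level statements).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helpers for comparison only.
static string Naive(string wildcard) => "^" + wildcard.Replace("*", ".*?");
static string Escaped(string wildcard) => "^" + Regex.Escape(wildcard).Replace("\\*", ".*") + "$";

// Missing '$' and unescaped '.': "*.ab" wrongly accepts "foo.abc".
bool naiveTail = Regex.IsMatch("foo.abc", Naive("*.ab"));
bool escapedTail = Regex.IsMatch("foo.abc", Escaped("*.ab"));

// Unescaped '[' produces an invalid regex, which throws when it is parsed.
bool naiveThrows;
try { Regex.IsMatch("a[b", Naive("*[*")); naiveThrows = false; }
catch (ArgumentException) { naiveThrows = true; }
bool escapedBracket = Regex.IsMatch("a[b", Escaped("*[*"));

Console.WriteLine($"{naiveTail} {escapedTail} {naiveThrows} {escapedBracket}");
```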
2

This is a small improvement on the popular answer from @Dmitry Bychenko above (https://stackoverflow.com/a/30300521/4491768). To match a literal ? or * in the text, the character has to be escaped in the pattern with a backslash: \? or \* (written "\\?" or "\\*" in a regular C# string literal).

Also a pre compiled regex will improve the performance (on reuse).

public class WildcardPattern
{
    private readonly string _expression;
    private readonly Regex _regex;

    public WildcardPattern(string pattern)
    {
        if (string.IsNullOrEmpty(pattern)) throw new ArgumentNullException(nameof(pattern));
       
        _expression = "^" + Regex.Escape(pattern)
            .Replace("\\\\\\?","??").Replace("\\?", ".").Replace("??","\\?")
            .Replace("\\\\\\*","**").Replace("\\*", ".*").Replace("**","\\*") + "$";
        _regex = new Regex(_expression, RegexOptions.Compiled);
    }

    public bool IsMatch(string value)
    {
        return _regex.IsMatch(value);
    }
}

Usage:

new WildcardPattern("Hello *\\**\\?").IsMatch("Hello W*rld?");
new WildcardPattern(@"Hello *\**\?").IsMatch("Hello W*rld?");
Wouter
0

This may help anyone working with C# and Excel interop who only knows part of a worksheet (WS) name, though it applies more widely. Here is my code using a wildcard (ddd*). In brief: the code iterates over all worksheet names, and if the abbreviated name of today's weekday (ddd) matches the first three letters of a worksheet name, that name is stored in a string for use after the loop.

using System;
using Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
using Range = Microsoft.Office.Interop.Excel.Range;
using System.Diagnostics;
using System.Reflection;
using System.IO;
using System.Text.RegularExpressions;

...
string weekDay = DateTime.Now.ToString("ddd*");

Workbook sourceWorkbook4 = xlApp.Workbooks.Open(LrsIdWorkbook, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
Workbook destinationWorkbook = xlApp.Workbooks.Open(masterWB, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);

            static String WildCardToRegular(String value)
            {
                return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
            }

            string wsName = null;
            foreach (Worksheet works in sourceWorkbook4.Worksheets)
            {
                Boolean startsWithddd = Regex.IsMatch(works.Name, WildCardToRegular(weekDay + "*"));

                    if (startsWithddd == true)
                    {
                        wsName = works.Name.ToString();
                    }
            }

            Worksheet sourceWorksheet4 = (Worksheet)sourceWorkbook4.Worksheets.get_Item(wsName);

...
ZIELIK
-1
public class Wildcard
{
    private readonly string _pattern;

    public Wildcard(string pattern)
    {
        _pattern = pattern;
    }

    public static bool Match(string value, string pattern)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end);
    }

    public static bool Match(string value, string pattern, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end, toLowerTable);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end, char[] toLowerTable)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end);
    }

    public bool IsMatch(string str, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str, ref int start, ref int end)
    {
        if (_pattern.Length == 0) return false;
        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;
                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (str[si] == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && str[si] != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }

    public bool IsMatch(string str, ref int start, ref int end, char[] toLowerTable)
    {
        if (_pattern.Length == 0) return false;

        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;

                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    char c = toLowerTable[str[si]];
                    if (c == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && c != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    continue;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }
}
nb.duong
  • 1
  • 1
  • 1
    A code only answer is not very useful. Giving a big piece of code with no explanation of what it does, or why it answers the question is not helpful to anyone. Writing identical code with no explanations on two different questions is not helpful. – AdrianHHH Jan 22 '21 at 12:07
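
For comparison, a much shorter regex-free matcher can be sketched with the usual greedy backtracking approach. This is not the code from the answer above and makes no claim to reproduce its behaviour (for instance its `start`/`end` outputs or its special handling of '.'); `WildcardMatch` is a hypothetical name, with `*` meaning zero or more characters and `?` exactly one. It is written as C# 9 top-level statements.

```csharp
using System;

// Hypothetical helper: '*' = zero or more characters, '?' = exactly one character.
static bool WildcardMatch(string text, string pattern)
{
    int t = 0, p = 0;
    int starP = -1; // position of the most recent '*' in the pattern
    int starT = 0;  // text position where that '*' started matching

    while (t < text.Length)
    {
        if (p < pattern.Length && pattern[p] == '*')
        {
            starP = p++;   // remember the star; first try letting it match nothing
            starT = t;
        }
        else if (p < pattern.Length && (pattern[p] == '?' || pattern[p] == text[t]))
        {
            p++;           // literal or '?' consumes one character
            t++;
        }
        else if (starP >= 0)
        {
            p = starP + 1; // backtrack: the last star absorbs one more character
            t = ++starT;
        }
        else
        {
            return false;  // mismatch with no star to fall back on
        }
    }

    // Only trailing stars may remain once the text is exhausted.
    while (p < pattern.Length && pattern[p] == '*') p++;
    return p == pattern.Length;
}

Console.WriteLine(WildcardMatch("Some Data X", "S*me*a*X"));
```

Only the most recent `*` is ever revisited, which is enough because an earlier star never needs to give characters back once a later star has been reached.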
-4

C# Console application sample

Command line Sample:
C:/> App_Exe -Opy PythonFile.py 1 2 3
Console output:
Argument list: -Opy PythonFile.py 1 2 3
Found python filename: PythonFile.py

using System;
using System.Text.RegularExpressions;           //Regex

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string cmdLine = String.Join(" ", args);

            bool bFileExtFlag = false;
            int argIndex = 0;
            Regex regex;
            foreach (string s in args)
            {
                //Search for the first argument matching the "*.py" pattern
                regex = new Regex(@"(?s:.*)\.py$", RegexOptions.IgnoreCase);
                bFileExtFlag = regex.IsMatch(s);
                if (bFileExtFlag == true)
                    break;
                argIndex++;
            }

            Console.WriteLine("Argument list: " + cmdLine);
            if (bFileExtFlag == true)
                Console.WriteLine("Found python filename: " + args[argIndex]);
            else
                Console.WriteLine("Python file with extension <.py> not found!");
        }


    }
}
geo le
  • 1
    So you solve an issue with an external application? do you realize how many unrequired resources are wasted? – NucS Aug 17 '17 at 08:31
  • @NucS I think we're supposed to analyse the code and figure out what's useful. Anyway I fail to see what this brings over other answers. – Jerther Feb 02 '18 at 15:52