22

I've been unable to find an answer on this: can I use the Regex.Matches method to return only the contents of items with curly braces?

If I use the regex ({[^}]*}), the values in my MatchCollection include the braces. I want to match on the braces, but return only the contents. Here's what I have so far:

Regex regex = new Regex(@"({[^}]*})", RegexOptions.IgnoreCase);
MatchCollection matches = regex.Matches("Test {Token1} {Token 2}");
// Results include braces (undesirable)
var results = matches.Cast<Match>().Select(m => m.Value).Distinct().ToList();
gotqn
PeterX

8 Answers

34

I always liked it explicit. So you can use "positive lookbehind" (?<=...) and "positive lookahead" (?=...) groups:

(?<=\{)
[^}]*
(?=\})

which means:

  • require opening curly bracket before match
  • collect the text (of course); as noted in the comments, this may be [^{}]* as well
  • require closing curly bracket after match
Milosz Krajewski
16

In C#, as in many other programming languages, the regex engine supports capturing groups: submatches, defined with parentheses in the pattern, that hold the part of the input matched by that part of the regex (e.g. 1([0-9])3 matches 123 and stores 2 in capture group 1). Captured text is accessed via Match.Groups[n].Value, where n is the index of the capture group inside the pattern.
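As a minimal sketch of the 1([0-9])3 example above (the class name CaptureDemo and the helper MiddleDigit are hypothetical, introduced only for illustration), this shows the difference between the whole match and capture group 1:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns the text captured by group 1 of 1([0-9])3,
    // or null when the input does not match at all.
    public static string MiddleDigit(string input)
    {
        Match m = Regex.Match(input, @"1([0-9])3");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Match m = Regex.Match("123", @"1([0-9])3");
        Console.WriteLine(m.Value);           // whole match: 123
        Console.WriteLine(m.Groups[0].Value); // group 0 is the whole match too: 123
        Console.WriteLine(m.Groups[1].Value); // first capture group: 2
    }
}
```

Group 0 always refers to the entire match, so the first pair of parentheses in the pattern is group 1.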

Capturing is much more efficient than lookarounds. Whenever there is no need for complex conditions, capturing groups are a much better alternative.

See my regex speed test performed at regexhero.net:

[Image: regex speed test results]

Now, how can we get the substring inside curly braces?

  • if there are no other curly braces inside, use a negated character class: {([^{}]*)
  • if there can be nested curly brackets: {((?>[^{}]+|{(?<c>)|}(?<-c>))*(?(c)(?!)))

In both cases, we match an opening { and then capture either (1) any characters other than { and }, or (2) all characters up to the } that pairs with the opening {.

Here is sample code:

var matches = Regex.Matches("Test {Token1} {Token 2}", @"{([^{}]*)");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct().ToList();
Console.WriteLine(String.Join(", ", results));
matches = Regex.Matches("Test {Token1} {Token {2}}", @"{((?>[^{}]+|{(?<c>)|}(?<-c>))*(?(c)(?!)))");
results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct().ToList();
Console.WriteLine(String.Join(", ", results));

Result: the first call prints Token1, Token 2 and the second prints Token1, Token {2}.

Note that RegexOptions.IgnoreCase is redundant when you have no literal letters that can have different case in the pattern.

Wiktor Stribiżew
5

Thanks, Milosz Krajewski. Nothing to add, but here is the function:

private List<String> GetTokens(String str)
{
    Regex regex = new Regex(@"(?<=\{)[^}]*(?=\})", RegexOptions.IgnoreCase);
    MatchCollection matches = regex.Matches(str);

    // The lookarounds keep the braces out of m.Value, so only the contents are returned
    return matches.Cast<Match>().Select(m => m.Value).Distinct().ToList();
}
bunjeeb
  • Thanks for this, but maybe this should be merged with [@Milosz Krajewski's answer](https://stackoverflow.com/a/16538131/5876282) – B Charles H Dec 17 '19 at 14:05
3

Just move the braces outside the parentheses:

 {([^}]*)}
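With this pattern the whole match still includes the braces; the contents sit in capture group 1. A minimal sketch of reading that group (the class name BraceTokens and the method Extract are hypothetical names used only for this illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BraceTokens
{
    // Hypothetical helper: returns the distinct contents of every {...} in the input.
    public static List<string> Extract(string input)
    {
        return Regex.Matches(input, @"{([^}]*)}")
                    .Cast<Match>()
                    // m.Value would be "{Token1}"; Groups[1] holds only what the parentheses matched.
                    .Select(m => m.Groups[1].Value)
                    .Distinct()
                    .ToList();
    }

    public static void Main()
    {
        // prints: Token1, Token 2
        Console.WriteLine(string.Join(", ", Extract("Test {Token1} {Token 2}")));
    }
}
```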
RichieHindle
  • Should the set `[^}]` be modified to `[^{}]`? Right now the regular expression also matches `{{{Hello}`, doesn't it? – Dirk May 14 '13 at 08:02
  • The `regex.Matches` method still returns the values with braces. – PeterX May 14 '13 at 23:32
  • @PeterX: You need to look at the `Captures` of the `Matches`. The captures contain the pieces between the parentheses. – RichieHindle May 15 '13 at 08:36
1

This is a regex for C# (.NET):

@"{(.*?)}"

It displays:

token1 token2

LeftyX
  • This results in a `match.Value` containing **also** the curly brackets. Which is exactly what the author of the question would like to avoid – garlix Apr 04 '18 at 08:49
1

A slight modification of @Milosz Krajewski's answer:

(?<=\{)[^}{]*(?=\})

This skips stray single opening and closing curly braces in the middle of the string.
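To make the difference concrete, here is a small sketch comparing the two lookaround patterns on the input {{{Hello} mentioned in the comments elsewhere in this thread (the class name LookaroundCompare and the helper Find are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaroundCompare
{
    // Hypothetical helper: returns the value of every match of pattern in input.
    public static List<string> Find(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        const string input = "{{{Hello}";

        // [^}]* may run across the extra opening braces.
        Console.WriteLine(string.Join(" | ", Find(input, @"(?<=\{)[^}]*(?=\})")));  // {{Hello

        // [^}{]* cannot contain a brace, so only the innermost text matches.
        Console.WriteLine(string.Join(" | ", Find(input, @"(?<=\{)[^}{]*(?=\})"))); // Hello
    }
}
```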

sumit sharma
0

If I understand what you want, change the regex to {([^}]*)}. That will capture only the text between { and }, not the braces themselves.

UberMouse
0

Thanks all for the regex tips! I know that this is not the answer to the original question but in case it helps someone else, I created this string extension method based on all your recommendations so I can replace localized string constants.

/// <summary>
/// Replaces each piece of text in curly brackets with the parameters,
/// in the order they appear in the text.
/// </summary>
/// <remarks>This is meant for const strings that cannot be
/// interpolated with $ or String.Format.</remarks>
/// <param name="text">The text that contains strings in curly
/// brackets.</param>
/// <param name="replaceTexts">The replacement texts, ordered as
/// they appear in the <paramref name="text"/>.</param>
/// <returns>The interpolated text, where the strings in curly brackets
/// are replaced with the replaceTexts parameters.</returns>
public static string ReplaceText(this string text, params string[] replaceTexts)
{
    if (string.IsNullOrEmpty(text)) return text;
    
    // Find every {TextToReplace}; the results include the curly brackets
    // so we can use string.Replace
    var matches = Regex.Matches(text, @"{(.*?)}");            
    var results = matches.Cast<Match>()
                         .Select(m => m.Value).Distinct().ToList();
    
    // Nothing to replace in the text, just return it
    if(!results.Any()) return text;
    
    // The number of elements to replace must match
    // the number of replaceTexts parameters
    if (results.Count != replaceTexts.Length)
    {
        throw new ArgumentOutOfRangeException(nameof(replaceTexts), "The text must contain exactly as many distinct curly-bracket strings to replace as there are replaceTexts parameters.");
    }

    var index = 0;
    foreach (var result in results)
    {
        text = text.Replace(result, replaceTexts[index++]);
    }

    return text;
}

Usage:

public const string MyConstantString = "Replace {thisText} with {otherText}";
MyConstantString.ReplaceText("first parameter", "second parameter");

The call above returns "Replace first parameter with second parameter".

SteveL
  • If you have a new question, please ask it by clicking the [Ask Question](https://stackoverflow.com/questions/ask) button. Include a link to this question if it helps provide context. - [From Review](/review/late-answers/31481942) – John V Apr 10 '22 at 02:55