5

Is there a regex flavor that allows me to count the number of repetitions matched by the * and + operators? I'd specifically like to know if it's possible under the .NET Platform.

polygenelubricants
luvieere

3 Answers

10

You're in luck: .NET regex does exactly this (which, as far as I know, is rare among regex flavors). In every Match, each Group stores every Capture it made, not just the last one.

So you can count how many times a repeatable pattern matched an input by:

  • Making it a capturing group
  • Counting how many captures were made by that group in each match
    • You can iterate through the individual captures too if you want!

Here's an example:

Regex r = new Regex(@"\b(hu?a)+\b");

var text = "hahahaha that's funny but not huahuahua more like huahahahuaha";
foreach (Match m in r.Matches(text)) {
   Console.WriteLine(m + " " + m.Groups[1].Captures.Count);
}

This prints (as seen on ideone.com):

hahahaha 4
huahuahua 3
huahahahuaha 5
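
To illustrate the point about iterating through individual captures, here is a minimal sketch that reuses the same pattern. The class name CaptureDemo and the helper CapturesOf are made up for this illustration; only Regex, Match, Group.Captures and Capture.Value come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: collects the text of every capture that
    // group 1 made during a single match.
    public static List<string> CapturesOf(Match m)
    {
        var result = new List<string>();
        foreach (Capture c in m.Groups[1].Captures)
        {
            result.Add(c.Value);
        }
        return result;
    }

    public static void Main()
    {
        var r = new Regex(@"\b(hu?a)+\b");
        Match m = r.Match("huahahahuaha");

        // Each repetition of (hu?a) is stored as its own Capture.
        Console.WriteLine(string.Join(",", CapturesOf(m))); // hua,ha,ha,hua,ha
    }
}
```

Note that m.Groups[1].Value on its own only gives the last repetition ("ha" here); the Captures collection is what keeps the full history.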

polygenelubricants
  • See also http://stackoverflow.com/questions/2250335/differences-among-net-capture-group-match and http://stackoverflow.com/questions/3320823/whats-the-difference-between-groups-and-captures-in-net-regular-expressions – polygenelubricants Jul 26 '10 at 08:13
3

You can use parentheses in the expression to create a group and then use the + or * operator on the group. The Captures property of the Group can be used to determine how many times it was matched. The following example counts the number of consecutive lower-case letters at the start of a string:

var regex = new Regex(@"^([a-z])+");
var match = regex.Match("abc def");

if (match.Success)
{
    Console.WriteLine(match.Groups[1].Captures.Count);
}
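
Since the question also asks about *, here is a small sketch of the same idea with the star operator. The class name StarDemo and the helper CountLeadingLowercase are made up for this illustration. With *, the match succeeds even when the group never participates, and in that case the Captures collection is simply empty.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarDemo
{
    // Hypothetical helper: counts how many times ([a-z]) repeated at the
    // start of the input. With *, zero repetitions is still a successful match.
    public static int CountLeadingLowercase(string input)
    {
        Match match = Regex.Match(input, @"^([a-z])*");
        return match.Groups[1].Captures.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountLeadingLowercase("abc def")); // 3
        Console.WriteLine(CountLeadingLowercase("123"));     // 0
    }
}
```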
Phil Ross
0

How about using a pattern like "pref ([a-z]+) suff", then using a group to capture the [a-z]+ inside the parentheses and taking its length?

You can use this length for subsequent matching as well.
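
A rough sketch of that idea in C#, assuming the literal "pref"/"suff" pattern above; the class name LengthDemo and the helper RepeatedLength are made up for this illustration. The group's Length counts characters, so it only equals the number of repetitions when the repeated unit matches exactly one character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LengthDemo
{
    // Hypothetical helper: returns the length of the text captured by
    // ([a-z]+), or -1 if the input does not match at all.
    public static int RepeatedLength(string input)
    {
        Match m = Regex.Match(input, @"pref ([a-z]+) suff");
        return m.Success ? m.Groups[1].Length : -1;
    }

    public static void Main()
    {
        Console.WriteLine(RepeatedLength("pref abcde suff")); // 5
        Console.WriteLine(RepeatedLength("pref 123 suff"));   // -1
    }
}
```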

Lie Ryan
  • Not applicable to repetition of a general pattern (see my answer for an example), but obviously this would work if the pattern matches exactly one character – polygenelubricants Jun 12 '10 at 16:00