How can I most efficiently iterate over the Index property of each Match in a MatchCollection?

I have many Regex objects in my code and I need to iterate through all match indices, but in the VS profiler I see that a simple LINQ query

regex.Matches(text).Cast<Match>().Select(x => x.Groups[1].Index)

and, inside it, the function

IEnumerator.MoveNext()

take almost half of the execution time. Is there some way to hand-code this iteration? Maybe pointer jumps through internal structures, or some other method that avoids IEnumerable<T>?

participant
eocron

1 Answer

As @L.B already noted, your LINQ expression is subject to deferred execution. That is, as you iterate through the MatchCollection, the Regex is executed at each step to produce the next Match, and that is most likely the performance hit you observe in MoveNext().
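If the LINQ layer itself is a concern, the matches can be walked with Match.NextMatch() instead of Cast<Match>().Select(...). The following is only a minimal sketch: the class and method names (MatchIndices, Group1Indices) are hypothetical, and since the regex still has to scan the text for every match, this removes the enumerator overhead rather than the matching cost.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchIndices
{
    // Hypothetical helper: collects the start index of capture group 1
    // for every match of `regex` in `text`.
    public static List<int> Group1Indices(Regex regex, string text)
    {
        var indices = new List<int>();

        // Match() finds the first match; NextMatch() resumes the search where
        // the previous match ended. No LINQ and no IEnumerator are involved.
        for (Match m = regex.Match(text); m.Success; m = m.NextMatch())
        {
            indices.Add(m.Groups[1].Index);
        }

        return indices;
    }
}

// Usage (assumed pattern and input):
//   var indices = MatchIndices.Group1Indices(new Regex(@"(\d+)"), "a12 b345 c6");
//   // indices now holds 1, 5, 10
```

Profiling this loop against the original query should show whether the time is really spent in the enumerator or in the regex engine itself.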

As a matter of fact, regexes are fairly heavy, but there are some tweaks you can make to improve performance (see Regex best practices).

What you might want to try is a compiled Regex:

Regex comp10 = new Regex(pattern, RegexOptions.Compiled);
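Since constructing a compiled Regex is itself relatively expensive, it usually pays to build each one once and reuse it. A minimal sketch, assuming the patterns are known up front; the class name Patterns, the field name Digits, and the pattern are all hypothetical:

```csharp
using System.Text.RegularExpressions;

public static class Patterns
{
    // Hypothetical pattern; substitute your own. The instance is created once,
    // when the class is first used, and then shared by all callers.
    public static readonly Regex Digits =
        new Regex(@"(\d+)", RegexOptions.Compiled);
}

// Usage (assumed input):
//   Match m = Patterns.Digits.Match("ab42");
//   int index = m.Groups[1].Index;   // 2
```

Regex instances are immutable and can be shared across threads, so a static readonly field is a common way to hold them.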
Community
participant
  • thx, I forgot about deferred execution :) I added the compiled option long before asking the question, and I think that's the bottleneck... if only I had a fast enough regex-like automaton indexer :( – eocron Nov 01 '14 at 09:58