
I have about a thousand regexes that I am trying to match efficiently.

I found this question, which proposes building one big automaton from all the regexes.

I tried with this code:

regexEndpoints.FirstOrDefault(x => x.UrlPathRegex.IsMatch(urlPath))

But obviously it performs very badly, especially when no regex matches, because the code then has to check every regex.

My question is: how can I get better performance when running multiple regexes in C#?

Ygalbel
  • You already have the [same](https://stackoverflow.com/questions/61724643/efficient-way-to-run-multiple-regexes-in-c-sharp) question, which was closed. – Guru Stron May 11 '20 at 08:10
  • In the first one I asked for a library, and I was told to ask it differently, so that's what I did. – Ygalbel May 11 '20 at 08:11
  • As for the question, it depends on the scenario: run them in parallel, or use [Compiled Regular Expressions](https://learn.microsoft.com/en-us/dotnet/standard/base-types/compilation-and-reuse-in-regular-expressions#compiled-regular-expressions). If it is still too slow, then I'm out of my depth =) – Guru Stron May 11 '20 at 08:18
  • I am pretty sure there is a way to create one big automaton from all the regexes, something like the RETE algorithm. – Ygalbel May 11 '20 at 08:19
  • Merge the expressions? https://stackoverflow.com/a/32341513/468973 – Magnus May 11 '20 at 08:42
  • Thanks for the answer, I tried it. It didn't give better performance. – Ygalbel May 11 '20 at 09:04
  • What are you trying to achieve? Do you want a yes/no answer on whether any of the regexes match? Do you need a count of how many of the regexes match? Do you need to know which one regex matches? Do you need any regex that matches but do not care which? Do you need to know all the regexes that match? I could keep on asking other variations. I believe that considering these sorts of questions will allow a useful answer to your question to be found. – AdrianHHH May 11 '20 at 18:42
  • I can stop after the first regex is matched – Ygalbel May 12 '20 at 06:56
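
One way to try the "merge the expressions" suggestion from the comments is to wrap each pattern in its own named group and join the groups into a single alternation, so that one compiled regex runs per URL and the group that matched identifies the endpoint. This is only a minimal sketch, not the asker's code: `CombinedMatcher`, the `p0`, `p1`, ... group names, and the idea of passing the patterns as strings are hypothetical. It assumes every pattern is anchored at the start with `^` (so the lowest-index matching pattern wins, as with `FirstOrDefault`) and that the patterns contain no named groups or numbered backreferences of their own. Whether it is faster depends on the patterns; the asker reports above that merging did not help in their case.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: merges many patterns into one alternation.
public sealed class CombinedMatcher
{
    private readonly Regex _combined;
    private readonly int _count;

    public CombinedMatcher(IReadOnlyList<string> patterns)
    {
        _count = patterns.Count;
        // Wrap each pattern in its own named group: (?<p0>...)|(?<p1>...)|...
        var alternation = string.Join("|",
            patterns.Select((p, i) => $"(?<p{i}>{p})"));
        // Compile once and reuse the instance for every lookup.
        _combined = new Regex(alternation,
            RegexOptions.Compiled | RegexOptions.CultureInvariant);
    }

    // Returns the index of the pattern whose group matched, or -1 if none did.
    public int FindFirstMatch(string input)
    {
        Match m = _combined.Match(input);
        if (!m.Success) return -1;
        for (int i = 0; i < _count; i++)
        {
            if (m.Groups[$"p{i}"].Success) return i;
        }
        return -1;
    }
}
```

The returned index can be used to look up the corresponding entry in `regexEndpoints`, provided the patterns were passed in the same order.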

1 Answer


Maybe you can check it in parallel:

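using System.Threading.Tasks;

// xxxxx is a placeholder for the endpoint type
xxxxx found = null;
Parallel.ForEach(regexEndpoints, (x, state) =>
{
    if (x.UrlPathRegex.IsMatch(urlPath))
    {
        // Several threads may match at once; whichever writes last wins
        found = x;
        // Ask the loop to stop starting further iterations
        state.Stop();
    }
});

if (found != null)
{
    // do something
}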

My C# is a bit rusty, but you get the idea.
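
The same idea can probably be written more compactly with PLINQ, which also avoids sharing a mutable variable between threads. This is a minimal sketch under assumptions: `Endpoint` is a hypothetical stand-in for the element type of `regexEndpoints`, and `ParallelLookup.FindAny` is an illustrative name. Without `AsOrdered()`, the element returned is some matching endpoint, not necessarily the first one in list order.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the element type of regexEndpoints.
public sealed class Endpoint
{
    public Endpoint(string pattern)
    {
        // Compile once; Regex instances are safe to share across threads for matching.
        UrlPathRegex = new Regex(pattern, RegexOptions.Compiled);
    }

    public Regex UrlPathRegex { get; }
}

public static class ParallelLookup
{
    // Returns a matching endpoint, or null when nothing matches.
    public static Endpoint FindAny(IEnumerable<Endpoint> regexEndpoints, string urlPath)
    {
        return regexEndpoints
            .AsParallel()
            .FirstOrDefault(x => x.UrlPathRegex.IsMatch(urlPath));
    }
}
```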

Rumpelstinsk
  • Thanks for the answer. The problem is that the code is already heavily multi-tasked, so this would give better performance for a single request but not across multiple requests. – Ygalbel May 11 '20 at 08:22