
I have the regex here: ^([a-zA-Z]+[’',.\-]?[a-zA-Z ]*)+[ ]([a-zA-Z]+[’',.\-]?[a-zA-Z ]+)+$

When I run the code below:

Pattern namePattern = Pattern.compile("^([a-zA-Z]+[’',.\\-]?[a-zA-Z ]*)+[ ]([a-zA-Z]+[’',.\\-]?[a-zA-Z ]+)+$");
Matcher nameMatcher = namePattern.matcher("hau hauhahahahahjdjdj");

It takes a very long time to complete. Why is the regex match so slow? Any suggestions on how to improve it?

TrungHau

1 Answer


I'd suggest taking a look at https://en.wikipedia.org/wiki/ReDoS#Evil_regexes

Your regex contains two groups with nested quantifiers, that is, a repeated group whose contents are themselves repeated:

([a-zA-Z]+[’',.\-]?[a-zA-Z ]*)+ 

and

([a-zA-Z]+[’',.\-]?[a-zA-Z ]+)+$

As a quick illustration of how this can slow matching down, compare the processing time and step count for three inputs: a few characters, the same input with more characters at the end, and, worse still, that set of characters repeated many times.
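
The growth can also be observed from Java itself. The sketch below is only a rough proxy, not a profiler: it wraps the input in a hypothetical CountingSequence class (not part of any library) that counts how often the regex engine calls charAt, and feeds the original pattern inputs that cannot match because they end in "!". The class and method names are made up for illustration, and the exact figures depend on the JDK version.

```java
import java.util.regex.Pattern;

class BacktrackingCost {
    // Original pattern from the question; \u2019 is the typographic apostrophe.
    static final Pattern NAME = Pattern.compile(
        "^([a-zA-Z]+[\u2019',.\\-]?[a-zA-Z ]*)+[ ]([a-zA-Z]+[\u2019',.\\-]?[a-zA-Z ]+)+$");

    /** Hypothetical helper: counts every character read made through charAt. */
    static final class CountingSequence implements CharSequence {
        private final CharSequence text;
        long reads = 0;

        CountingSequence(CharSequence text) { this.text = text; }

        @Override public int length() { return text.length(); }
        @Override public char charAt(int index) { reads++; return text.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return text.subSequence(start, end);
        }
        @Override public String toString() { return text.toString(); }
    }

    /** Counts charAt calls while matching the given number of 'a' letters followed by '!'. */
    static long readsForFailingInput(int letters) {
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < letters; i++) {
            input.append('a');
        }
        input.append('!'); // '!' is not allowed anywhere, so the match must fail

        CountingSequence seq = new CountingSequence(input);
        if (NAME.matcher(seq).matches()) {
            throw new IllegalStateException("input was expected not to match");
        }
        return seq.reads;
    }

    public static void main(String[] args) {
        // Print the work done for a few input lengths to see how quickly it grows.
        for (int n = 4; n <= 16; n += 4) {
            System.out.println(n + " letters -> " + readsForFailingInput(n) + " character reads");
        }
    }
}
```

If the number of reads grows much faster than the input length, the engine is spending its time retrying different ways of splitting the same letters between the repeated pieces of the group.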

To fix this, narrow the regular expression down to what you are actually trying to capture, and remove some of the nested repetition. Without knowing more about your desired input and output it's hard to guess exactly what you want, but I'd wager something like this would accomplish the same thing faster:

^([a-zA-Z’',.\-]+) ([a-zA-Z’',.\-]+)$

or more inclusively

^([^ ]+) ([^ ]+)$
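
For reference, here is a minimal Java sketch of the first simplified pattern in use. The class name, method name and sample inputs are assumptions chosen for illustration, and the typographic apostrophe is written as a Unicode escape to avoid source-encoding problems.

```java
import java.util.regex.Pattern;

class SimplifiedNamePattern {
    // Simplified pattern: exactly two parts separated by one space, each a run
    // of letters and the allowed punctuation (no nested quantifiers).
    static final Pattern TWO_PARTS = Pattern.compile(
        "^([a-zA-Z\u2019',.\\-]+) ([a-zA-Z\u2019',.\\-]+)$");

    /** Returns true if the whole input consists of two space-separated parts. */
    static boolean isTwoPartName(String input) {
        return TWO_PARTS.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "hau hauhahahahahjdjdj", // input from the question
            "O'Brien Smith-Jones",   // punctuation inside each part
            "single",                // no space, so no second part
            "Mary Ann Smith"         // three parts
        };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + isTwoPartName(sample));
        }
    }
}
```

Unlike the original pattern, this version does not allow spaces inside either part, so an input such as "Mary Ann Smith" is rejected. If names with more than two words must be accepted, the pattern would need to be extended accordingly.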

Another good reference

Artichoke
  • Thank you, you're right. I have updated my regex to `^([a-zA-Z]+[\'\,\.\-]+[a-zA-Z ]+) ([a-zA-Z]+[\'\,\.\-]+[a-zA-Z ]+)$` and it is faster now. – TrungHau Jul 27 '18 at 17:35