Why are you matching whitespace separately from the other characters? And why are you anchoring the match at the beginning, but not at the end? If you want to make sure the string doesn't start or end with whitespace, you should do something like this:
^[A-Za-z0-9_.&,-]+(?:\s+[A-Za-z0-9_.&,-]+)*$
Now there's only one "path" the regex engine can take through the string. If it runs out of characters that match [A-Za-z0-9_.&,-] before reaching the end, and the next character doesn't match \s, it fails immediately. If it reaches the end while still matching whitespace characters, it fails because it's required to match at least one non-whitespace character after each run of whitespace.
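If it helps to see that behavior, here is a minimal sketch in Java. The class name, the method name, and the sample strings are made up for illustration (they are not your actual input); only the pattern comes from above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from the answer.
class SinglePathDemo {
    // Runs of allowed characters separated by runs of whitespace,
    // with no whitespace permitted at either end.
    static final Pattern STRICT =
            Pattern.compile("^[A-Za-z0-9_.&,-]+(?:\\s+[A-Za-z0-9_.&,-]+)*$");

    static boolean isValid(String input) {
        // matches() requires the whole input to match, so the anchors
        // are redundant here but harmless.
        return STRICT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Made-up samples: two valid strings, then leading whitespace,
        // trailing whitespace, and a trailing smiley.
        String[] samples = { "foo bar", "foo \t bar", " foo", "foo ", "foo bar :(" };
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + isValid(s));
        }
    }
}
```

The first two samples should be accepted and the last three rejected: leading whitespace, trailing whitespace, and the characters of the smiley each leave the engine with nothing else to try.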
If you want to make sure there's exactly one whitespace character separating the runs of non-whitespace, just remove the quantifier from \s+:
^[A-Za-z0-9_.&,-]+(?:\s[A-Za-z0-9_.&,-]+)*$
If you don't care where the whitespace is in relation to the non-whitespace, just match them all with the same character class:
^[A-Za-z0-9_.&,\s-]+$
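To make the difference between the three variants concrete, here is a rough comparison sketch in Java. The class, the constant names (RUNS, SINGLE, ANYWHERE), and the sample strings are assumptions chosen for illustration; the patterns are the three shown above.

```java
import java.util.regex.Pattern;

// Hypothetical comparison harness; the names are illustrative only.
class VariantComparison {
    // Runs of whitespace allowed between tokens.
    static final Pattern RUNS =
            Pattern.compile("^[A-Za-z0-9_.&,-]+(?:\\s+[A-Za-z0-9_.&,-]+)*$");
    // Exactly one whitespace character between tokens.
    static final Pattern SINGLE =
            Pattern.compile("^[A-Za-z0-9_.&,-]+(?:\\s[A-Za-z0-9_.&,-]+)*$");
    // Whitespace allowed anywhere, including at the ends.
    static final Pattern ANYWHERE =
            Pattern.compile("^[A-Za-z0-9_.&,\\s-]+$");

    static boolean accepts(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "a b", "a  b", " a b ", "   " };
        for (String s : samples) {
            System.out.println("[" + s + "]"
                    + " runs=" + accepts(RUNS, s)
                    + " single=" + accepts(SINGLE, s)
                    + " anywhere=" + accepts(ANYWHERE, s));
        }
    }
}
```

One thing to keep in mind with the last variant: because whitespace is just another member of the character class, it also accepts a string that consists of nothing but whitespace.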
I'm assuming you know that your regex won't match the given input because of the : and ( in the smiley, and you just want to know why it takes so long to fail.
And of course, since you're creating the regex in the form of a Java string literal, you would write:
"^[A-Za-z0-9_.&,-]+(?:\\s+[A-Za-z0-9_.&,-]+)*$"
or
"^[A-Za-z0-9_.&,-]+(?:\\s[A-Za-z0-9_.&,-]+)*$"
or
"^[A-Za-z0-9_.&,\\s-]+$"
(I know you had double backslashes in the original question, but that was probably just to get them to display properly, since you weren't using SO's excellent code formatting feature.)
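In case the escaping is the confusing part, here is a small sketch of what the regex engine actually receives. The class and method names are hypothetical; the point is only that the doubled backslash in Java source becomes a single backslash in the string handed to Pattern.

```java
import java.util.regex.Pattern;

// Hypothetical demo of Java string-literal escaping for regexes.
class EscapingDemo {
    static String regexSeenByEngine() {
        // In Java source, "\\s" denotes two characters: a backslash and 's'.
        return "^[A-Za-z0-9_.&,\\s-]+$";
    }

    public static void main(String[] args) {
        // Prints the regex with a single backslash before the 's'.
        System.out.println(regexSeenByEngine());
        // The literal "\\s" is a two-character string.
        System.out.println("\\s".length());
        // The compiled pattern treats \s as the whitespace class.
        System.out.println(Pattern.matches(regexSeenByEngine(), "a b"));
    }
}
```

If the string really contained two backslashes before the s, the regex would look for a literal backslash instead of whitespace, which is not what you want here.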