There are many pattern types that can match empty strings. The OP's regex is of the ^.*$ type, and it is easy to stop it from matching an empty string by replacing the * (= {0,}) quantifier (meaning zero or more) with the + (= {1,}) quantifier (meaning one or more), as has already been mentioned in the other posts here.
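As a minimal sketch of that swap, assuming Python's re module (the list names and the accepts helper are made up for illustration), both spellings of each quantifier can be checked against an empty and a non-empty string:

```python
import re

# * and {0,} are the same quantifier, as are + and {1,}.
ZERO_OR_MORE = [r'^.*$', r'^.{0,}$']
ONE_OR_MORE = [r'^.+$', r'^.{1,}$']

def accepts(pattern, text):
    """Return True if the pattern matches the whole of text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

for pattern in ZERO_OR_MORE + ONE_OR_MORE:
    print(pattern, 'empty:', accepts(pattern, ''), 'abc:', accepts(pattern, 'abc'))
```

Only the zero-or-more variants should report a match for the empty string; all four accept abc.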
There are other pattern types matching empty strings, and it is not always obvious how to prevent them from matching empty strings.
Here are a few of those patterns with solutions:
[^"\\]*(?:\\.[^"\\]*)* ⇒ (?:[^"\\]|\\.)+
abc||def ⇒ abc|def (remove the extra | alternation operator)
^a*$ ⇒ ^a+$ (+ matches 1 or more chars)
^(a)?(b)?(c)?$ ⇒ ^(?!$)(a)?(b)?(c)?$ (the (?!$) negative lookahead fails the match if the end of string immediately follows the start of the string, i.e. the string is empty)
or ⇒ ^(?=.)(a)?(b)?(c)?$ (the (?=.) positive lookahead requires at least one char; . may or may not match line break chars depending on modifiers/regex flavor)
^$|^abc$ ⇒ ^abc$ (remove the ^$ alternative that enables a regex to match an empty string)
^(?:abc|def)?$ ⇒ ^(?:abc|def)$ (remove the ? quantifier that made the (?:abc|def) group optional)
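One way to sanity-check rewrites like these is sketched below, assuming Python's re module. The PAIRS table, the sample inputs and the matches_empty helper are illustrative and not part of the original patterns' context; the lookahead variants are written with (c)? as in the pattern being fixed. Other flavors may differ in details such as how . treats line breaks.

```python
import re

# (original pattern, fixed pattern, a non-empty sample the fix must still accept)
PAIRS = [
    (r'[^"\\]*(?:\\.[^"\\]*)*', r'(?:[^"\\]|\\.)+', r'a\"b'),
    (r'abc||def', r'abc|def', 'def'),
    (r'^a*$', r'^a+$', 'aaa'),
    (r'^(a)?(b)?(c)?$', r'^(?!$)(a)?(b)?(c)?$', 'ac'),
    (r'^(a)?(b)?(c)?$', r'^(?=.)(a)?(b)?(c)?$', 'b'),
    (r'^$|^abc$', r'^abc$', 'abc'),
    (r'^(?:abc|def)?$', r'^(?:abc|def)$', 'def'),
]

def matches_empty(pattern):
    """Return True if the pattern can match the empty string (hypothetical helper)."""
    return re.fullmatch(pattern, '') is not None

for original, fixed, sample in PAIRS:
    print(
        f'{original!r}: empty={matches_empty(original)}  '
        f'{fixed!r}: empty={matches_empty(fixed)}, '
        f'sample={re.fullmatch(fixed, sample) is not None}'
    )
```

Each original should report empty=True and each fix empty=False, while the fix still accepts its sample.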
To keep \b(?:north|south)?(?:east|west)?\b (which matches north, south, east, west, northeast, northwest, southeast, southwest) from matching empty strings, the word boundaries must be made more specific: make the leading word boundary match only at the start of a word by adding (?<!\w) after it, and make the trailing word boundary match only at the end of a word by adding (?!\w) after it.
\b(?:north|south)?(?:east|west)?\b ⇒ \b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)
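The difference can be seen with a short sketch, again assuming Python's re module (the LOOSE and STRICT names and the sample text are made up for illustration). Because both groups are optional, the loose pattern also succeeds with an empty match at any word boundary, whereas the strict pattern needs a start-of-word and an end-of-word position, which can never coincide:

```python
import re

LOOSE = re.compile(r'\b(?:north|south)?(?:east|west)?\b')
STRICT = re.compile(r'\b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)')

text = 'go northwest, then east'

# Collect every match, including zero-length ones.
loose = [m.group() for m in LOOSE.finditer(text)]
strict = [m.group() for m in STRICT.finditer(text)]

print('loose :', loose)   # contains empty strings next to the real hits
print('strict:', strict)  # only the direction words
```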