OK regex nerds! I am using regex lookahead assertions for password validation, with a pattern similar to this one:

\A(?=\w{6,10}\z)(?=[^a-z]*[a-z])(?=(?:[^A-Z]*[A-Z]){3})(?=\D*\d)

However, we want to only require that any 3 of the 4 assertions be valid - not necessarily all of them. Any thoughts on how this could be done?

atjoedonahue

3 Answers


To shorten any pattern like this, factor out the common sub-patterns:

\A(?:
    (?=\w{6,10}\z) (?=.*[a-z]) (?: (?:.*[A-Z]){3} | .*\d )
   |
    (?=.*\d) (?=(?:.*[A-Z]){3}) (?: .*[a-z] | \w{6,10}\z )
)

Note that you don't need a lookahead to test the last condition.
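
One way to convince yourself that the factorized form really means "at least 3 of 4" is to treat each condition as a boolean and try all 16 combinations. The sketch below does that in Python. The function and parameter names (length, lower, upper, digit) are illustrative labels only and are not part of the pattern above.

```python
from itertools import product


def factorized(length: bool, lower: bool, upper: bool, digit: bool) -> bool:
    """Boolean shape of the factorized pattern:
    length, lower, (upper or digit)  |  digit, upper, (lower or length)."""
    return (length and lower and (upper or digit)) or (
        digit and upper and (lower or length)
    )


def at_least_three(length: bool, lower: bool, upper: bool, digit: bool) -> bool:
    """Reference definition: three or more of the four conditions hold."""
    return sum((length, lower, upper, digit)) >= 3


def factorization_is_exact() -> bool:
    """Compare both definitions on every one of the 2**4 truth assignments."""
    return all(
        factorized(*bits) == at_least_three(*bits)
        for bits in product((False, True), repeat=4)
    )
```

Calling `factorization_is_exact()` compares the two definitions on every assignment, so it only says something about the boolean structure, not about any particular regex engine.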


Another way, where each condition is optional and a named group is used as a counter (.NET only):

\A
(?<c>(?=\w{6,10}\z))?
(?<c>(?=[^a-z]*[a-z]))?
(?<c>(?=(?:[^A-Z]*[A-Z]){3}))?
(?<c>(?=\D*\d))?
(?<-c>){3} # decrement c 3 times
(?(c)|(?!$)) # conditional: force the pattern to fail if too few conditions succeed.

Casimir et Hippolyte

There's no "easy" way to do this in a single regular expression. The only way would be to spell out every possible combination of the "three out of four" assertions, e.g.

\A(?=\w{6,10}\z)(?=[^a-z]*[a-z])(?=(?:[^A-Z]*[A-Z]){3})| # Maybe no digit
\A(?=[^a-z]*[a-z])(?=(?:[^A-Z]*[A-Z]){3})(?=\D*\d)| # Maybe wrong length
\A(?=\w{6,10}\z)(?=(?:[^A-Z]*[A-Z]){3})(?=\D*\d)| # Maybe no lower
\A(?=\w{6,10}\z)(?=[^a-z]*[a-z])(?=\D*\d) # Maybe not enough uppers

However, this mind-melting regex is clearly not a good solution.

A better approach would be to perform the four checks separately (with regex or otherwise) and check that at least three of them pass.
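
A minimal sketch of that approach in Python, assuming the four conditions from the question. The names `CHECKS`, `passed_checks` and `is_valid` are made up for illustration, and `\z` is written `\Z` because that is the end-of-string anchor Python's `re` module accepts.

```python
import re

# The four conditions from the question, each as its own check.
# Assumption: Python's re module, where the end-of-string anchor is \Z.
CHECKS = {
    "length": re.compile(r"\A\w{6,10}\Z"),        # 6 to 10 word characters
    "lower": re.compile(r"[a-z]"),                # at least one lowercase letter
    "upper": re.compile(r"(?:[^A-Z]*[A-Z]){3}"),  # at least three uppercase letters
    "digit": re.compile(r"\d"),                   # at least one digit
}


def passed_checks(password: str) -> list[str]:
    """Return the names of the conditions that the password satisfies."""
    return [name for name, rx in CHECKS.items() if rx.search(password)]


def is_valid(password: str, required: int = 3) -> bool:
    """True when at least `required` of the four conditions hold."""
    return len(passed_checks(password)) >= required
```

Because each check has a name, `passed_checks` can also be used to tell the user which rule failed, which a single combined regex cannot do.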

...However, let's take a step back here and ask: Why are you doing this?? You're implementing a password entropy check. Based on your fuzzy rules, the following passwords are valid:

  • AAAa1
  • password1
  • LETmein

And the following passwords are invalid:

  • reallylongsecurepassword8374235359232
  • HorseBatteryStapleCorrect

I would strongly advise against such a bizarrely restrictive policy.

Tom Lord
  • I expected it could be accomplished with some kind of quantifier on the groups - I'm surprised that it can't be done without defining all permutations, regardless of the specific rules (which, again, is not my idea but just the requirements I've been handed). – atjoedonahue Dec 01 '17 at 14:32
  • Not if your logic is as simple as "does the regex match the string?". As I said, it would be *much* easier if you just add a little logic outside of a single regex match. – Tom Lord Dec 01 '17 at 14:33
  • Here's an important career tip: When you are handed stupid requirements, push back. Explain the flaw in the design. Tell them that `password1` would be deemed "secure", but `kjhKJFHjhjkhsgjkhjshgfkdjsfgFJKDHLWEKkdghagjdjdhfJHDFG` would be considered "insecure". Propose a better solution. – Tom Lord Dec 01 '17 at 14:35
  • To comfort you, the pattern itself is a correct password. – Casimir et Hippolyte Dec 01 '17 at 14:54
  • To be fair, I did push back and this is no longer a requirement. But the technical question on one or more assertions is still interesting to me. So much of this conversation has been on the specifics of the regex which are arbitrary. – atjoedonahue Dec 01 '17 at 14:54
  • @atjoedonahue Another way you *could* implement it would be to wrap each "assertion" in an optional capture group, and then simply count the number of groups in the result. E.g. something like: `"abc".match(/((?=.*a))?((?=.*b))?((?=.*c))?((?=.*d))?/).captures.compact.count # => 3` (But again, why do this? I can't see any reason to choose this over performing 4 separate checks.) – Tom Lord Dec 01 '17 at 15:05

Brief

The easiest method would be to have separate regular expressions and check in your code whether at least 3 of the 4 match. The only way to do this in regex is to present all cases. That being said, the pattern below is probably the easiest way (in regex) to present all options, because it lets you edit each sub-pattern in one location (where it is defined) rather than in multiple places (which is more prone to bugs). The DEFINE construct is seldom supported by regex engines, but PCRE supports it.

You can also have your code generate each regex combination. See this question about generating all permutations of a list in Python.
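
As a rough illustration of generating the alternatives in code, the sketch below builds the alternation with `itertools.combinations`. The order of the lookaheads does not matter, so combinations are enough and full permutations are not needed. The names `ASSERTIONS`, `at_least` and `PATTERN` are hypothetical, and `\z` is written `\Z` on the assumption that the pattern is compiled with Python's `re` module.

```python
import re
from itertools import combinations

# The four lookahead assertions from the question.
# Assumption: \z is spelled \Z so that Python's re module accepts it.
ASSERTIONS = [
    r"(?=\w{6,10}\Z)",
    r"(?=[^a-z]*[a-z])",
    r"(?=(?:[^A-Z]*[A-Z]){3})",
    r"(?=\D*\d)",
]


def at_least(k: int, assertions: list[str]) -> str:
    """Build one pattern that matches when at least k assertions hold."""
    # A string that satisfies more than k assertions also satisfies some
    # k-sized subset, so listing the k-sized subsets is sufficient.
    branches = ["".join(combo) for combo in combinations(assertions, k)]
    return r"\A(?:" + "|".join(branches) + ")"


PATTERN = re.compile(at_least(3, ASSERTIONS))
```

Changing one entry in `ASSERTIONS` updates every branch, which gives a similar single-point-of-edit benefit in engines that have no DEFINE support.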

I don't know why you want to do this for passwords; it's considered malpractice. But since you're asking for it, I figured I'd give you the easiest solution possible in regex. You really should only check minimum length (and, if you want, complexity based on an algorithm, to show the user how secure your system finds their password to be).


Code

(?(DEFINE)
   (?<w>(?=\w{6,10}\z))
   (?<l>(?=[^a-z]*[a-z]))
   (?<u>(?=(?:[^A-Z]*[A-Z]){3}))
   (?<d>(?=\D*\d))
)
\A(?:
    (?&w)(?&l)(?&u)|
    (?&w)(?&l)(?&d)|
    (?&w)(?&u)(?&d)|
    (?&l)(?&u)(?&d)
)

Note: The regex above uses the x modifier (ignore whitespace) so that we can nicely organize the content.

ctwheels
  • I'm not familiar with the DEFINE construct. Can you point me to a resource about it? – atjoedonahue Dec 01 '17 at 15:14
  • @atjoedonahue I honestly don’t know of any good resources for the DEFINE construct. I’ve tried to find some in the past but haven’t had much luck. I know [regex101](https://regex101.com) has brief documentation about it though – ctwheels Dec 01 '17 at 15:28
  • @atjoedonahue [this article](https://www.regular-expressions.info/subroutine.html) has some information on it. – ctwheels Dec 01 '17 at 15:33