How can I match the letters a, b, c, each at most once, in any order and with varying length?
The expression should match these cases:
abc
bc
a
b
bca
but should not match these ones:
abz
aab
cc
x
Use regex pattern
\b(?!\w*(\w)\w*\1)[abc]+\b
You can use this pattern with any set and size; just replace [abc] with the desired set.
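As a quick check, here is a minimal sketch that runs the pattern over a sample sentence. Python's re module is my choice for illustration (the answer does not name a language), and the sample text and variable names are made up.

```python
import re

# Pattern from the answer: a whole word made only of a, b, c
# in which no character occurs twice.
PATTERN = re.compile(r"\b(?!\w*(\w)\w*\1)[abc]+\b")

# Hypothetical sample text built from the question's valid and invalid words.
text = "abc bc a b bca abz aab cc x"

# finditer + group(0) yields whole matches; findall would return
# only the contents of capture group 1.
matches = [m.group(0) for m in PATTERN.finditer(text)]
print(matches)  # ['abc', 'bc', 'a', 'b', 'bca']
```

Note that the lookahead uses \w, so it rejects a word as soon as any word character repeats, not only a, b or c.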
^(?=([^a]*a?[^a]*)$)(?=([^b]*b?[^b]*)$)(?=([^c]*c?[^c]*)$)[abc]{1,3}$
This works with lookaheads.
It includes the following pattern in three variations, one per letter: (?=([^a]*a?[^a]*)$)
It says: from here (the beginning) to the end, there may be at most one a.
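To show how the three variations fit together, here is a rough Python sketch that generates one lookahead per letter. The helper name at_most_once_pattern is hypothetical, it assumes the letters are plain alphanumerics (nothing that needs escaping inside a character class), and it drops the capture groups from the lookaheads since nothing refers back to them.

```python
import re

def at_most_once_pattern(letters):
    """Build ^(?=[^x]*x?[^x]*$)...[letters]{1,n}$ for plain alphanumeric letters."""
    # One anchored lookahead per letter: at most one occurrence up to the end.
    lookaheads = "".join(f"(?=[^{ch}]*{ch}?[^{ch}]*$)" for ch in letters)
    # The final character class restricts the alphabet and the length.
    return re.compile(f"^{lookaheads}[{letters}]{{1,{len(letters)}}}$")

pattern = at_most_once_pattern("abc")

# The question's test cases: the first five should match, the rest should not.
for s in ["abc", "bc", "a", "b", "bca", "abz", "aab", "cc", "x"]:
    print(s, bool(pattern.match(s)))
```

For "abc" the generated pattern is the same as the one above, minus the capture groups.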
Combining lookaheads and backreferences:
^([abc])((?!\1)([abc])((?!\1)(?!\3)[abc])?)?$
Just to round out the collection:
^(?:([abc])(?!.*\1))+$
Want to handle a larger set of characters? No problem:
^(?:([abcdefgh])(?!.*\1))+$
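Here is a small sketch of how that scales, using Python's re module as an assumed host language. The helper unique_chars_pattern is a made-up name, and it assumes the character set contains nothing that needs escaping inside [...].

```python
import re

def unique_chars_pattern(charset):
    """Whole-string check: every character is in charset and none repeats."""
    # Each iteration captures one character in group 1, then the negative
    # lookahead forbids that same character anywhere later in the string.
    return re.compile(rf"^(?:([{charset}])(?!.*\1))+$")

small = unique_chars_pattern("abc")
wide = unique_chars_pattern("abcdefgh")

print(bool(small.match("bca")))      # True
print(bool(small.match("aab")))      # False
print(bool(wide.match("hgfedcba")))  # True
print(bool(wide.match("abca")))      # False
```

Unlike the versions with one lookahead per letter, the pattern does not grow with the size of the set; only the character class changes.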
EDIT: Apparently I misread the question; you're not validating individual strings like "abc" and "ba", you're trying to find whole-word matches in a larger string. Here's how I would do that:
\b(?:([abc])(?![abc]*\1))+\b
The tricky part is making sure the lookahead doesn't look beyond the end of the word that's currently being matched. For example, if I had left the lookahead as (?!.*\1), it would fail to match the abc in abc za because the lookahead would incorrectly flag the a in za as a duplicate of the a in abc. Allowing the lookahead to look only at valid characters ([abc]*) keeps it on a sufficiently short leash. And if there are invalid characters in the current word, it's not the lookahead's job to spot them anyway.
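The abc za case is easy to reproduce. This is only a sketch, assuming Python's re module; the variable names are mine.

```python
import re

text = "abc za"

# Lookahead scans to the end of the line, so the a in "za" counts as a duplicate.
unbounded = re.compile(r"\b(?:([abc])(?!.*\1))+\b")

# Lookahead may only cross valid characters, so it stops at the end of the word.
bounded = re.compile(r"\b(?:([abc])(?![abc]*\1))+\b")

unbounded_hits = [m.group(0) for m in unbounded.finditer(text)]
bounded_hits = [m.group(0) for m in bounded.finditer(text)]

print(unbounded_hits)  # []
print(bounded_hits)    # ['abc']
```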
(Thanks to Honest Abe for bringing this back to my attention.)
^(?!.*a.*a)(?!.*b.*b)(?!.*c.*c)[abc]{1,3}$
The anchored negative look-aheads reject any string in which a letter occurs twice, which limits the number of occurrences of each letter to one.
I linked it in a comment (this is sort of a dupe of "How can I find repeated characters with a regex in Java?"), but to be more specific, the regex:
(\w)\1+
will match a run of two or more of the same character. Negate that and you have your regex.
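A short sketch of what that looks like in practice, assuming Python's re module. The negated form below is one possible reading of "negate that" (a negative lookahead), not something the answer spells out, and the variable names are made up.

```python
import re

# The pattern from the answer: a character followed by itself one or more times.
adjacent_repeat = re.compile(r"(\w)\1+")

print(adjacent_repeat.search("aab").group(0))  # aa
print(adjacent_repeat.search("aba"))           # None

# Assumed negation: reject strings containing an adjacent repeat,
# then require only a, b, c.
no_adjacent_repeat = re.compile(r"^(?!.*(\w)\1)[abc]+$")

print(bool(no_adjacent_repeat.match("abc")))  # True
print(bool(no_adjacent_repeat.match("aab")))  # False
print(bool(no_adjacent_repeat.match("aba")))  # True
```

As the last line shows, this only catches repeats that sit next to each other; a string like aba still gets through, so repeats that are not adjacent need one of the lookahead patterns from the other answers.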