
I am trying to write a regular expression to match "ismU" flags. The requirements are as follows: 1) each character appears at most once; 2) the characters can appear in any order, e.g. "is", "si", "mi", "smi", "Uims".

Requirement 1) suggests the "?" quantifier, and requirement 2) suggests "|" alternation.

"i?|U?|m?|s?" only matches a single flag (a string of length at most 1).

"[imsU]{1,4}" handles lengths up to 4, but it accepts duplicated flags (e.g., "ii").

Test cases that should match: ["i", "im", "mi", "Ums", "iUsm"]. Test cases that should not match: ["I", "mm"].
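
To make the problem concrete, here is a minimal sketch, assuming Python's `re` module, that runs both attempts against the test cases above. The helper name `accepted` and the dictionary `attempts` are only illustrative.

```python
import re

# Test cases taken from the question.
should_match = ["i", "im", "mi", "Ums", "iUsm"]
should_fail = ["I", "mm"]

# The two attempts described above.
attempts = {
    "alternation": r"i?|U?|m?|s?",
    "char_class": r"[imsU]{1,4}",
}

def accepted(pattern, strings):
    """Return the strings that the pattern matches in full."""
    return [s for s in strings if re.fullmatch(pattern, s)]

for name, pattern in attempts.items():
    print(name, accepted(pattern, should_match), accepted(pattern, should_fail))
```

The alternation accepts only "i" from the first list, while the character class accepts everything in the first list but also lets "mm" through.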

Peipei

1 Answer


You can do it with a negative lookahead assertion that tests whether any character is repeated (anywhere in the string). To express that, you need a capture group (.) and a backreference to it, \1.

^(?!.*(.).*\1)[imsU]+$
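
As a quick check, here is a minimal sketch, assuming Python's `re` module, that applies this pattern to the test cases from the question. The names `FLAGS` and `is_valid` are only illustrative.

```python
import re

# Pattern copied from the answer: the lookahead rejects any string in which
# some character (.) occurs again later (\1); the class restricts the alphabet.
FLAGS = re.compile(r"^(?!.*(.).*\1)[imsU]+$")

def is_valid(s):
    """True if s consists of distinct characters from 'imsU'."""
    # Note: in Python, `$` also matches just before a trailing newline;
    # use `\Z` instead of `$` if that matters for your input.
    return FLAGS.match(s) is not None

for s in ["i", "im", "mi", "Ums", "iUsm", "I", "mm"]:
    print(repr(s), is_valid(s))
```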

Note that you don't need a quantifier more precise than +: the lookahead has already checked that no character occurs twice, and the character class contains only four different characters.

However, to be more efficient (in particular, to avoid testing .*(.).*\1 against a long string), you can also write the pattern like this:

^(?!.{0,2}(.).{0,2}\1)[imsU]{1,4}$
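
To see that the bounded version accepts the same strings as the first pattern, here is a minimal sketch, assuming Python's `re` module. It compares both patterns against a plain-Python statement of the requirements on every string of length 0 to 5 over the letters "imsUI". The `reference` function and the other names are illustrative and not part of the answer.

```python
import re
from itertools import product

UNBOUNDED = re.compile(r"^(?!.*(.).*\1)[imsU]+$")
BOUNDED = re.compile(r"^(?!.{0,2}(.).{0,2}\1)[imsU]{1,4}$")

def reference(s):
    """Requirements from the question, written without a regex."""
    return len(s) > 0 and set(s) <= set("imsU") and len(set(s)) == len(s)

# Every string of length 0..5 over a small alphabet that includes one
# invalid letter ("I").
alphabet = "imsUI"
candidates = ["".join(p) for n in range(6) for p in product(alphabet, repeat=n)]

# Strings on which the three checks disagree (expected to be empty).
mismatches = [
    s for s in candidates
    if not (bool(UNBOUNDED.match(s)) == bool(BOUNDED.match(s)) == reference(s))
]
valid = [s for s in candidates if BOUNDED.match(s)]
print(len(candidates), len(valid), mismatches)
```

The bounds work because a valid string has at most 4 characters, so the first copy of a repeated character is preceded by at most 2 characters and separated from its repeat by at most 2.
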
Casimir et Hippolyte
  • For anyone wondering how to use a negative lookahead when the rest of the regex has its own capturing groups: https://stackoverflow.com/questions/32862316/negative-lookahead-with-capturing-groups shows an example. The backreference becomes \2 because the outer group around the flags is now group 1. Here is what I ended up using: re.compile("(\(\?(?!.{0,2}(.).{0,2}\\2)[imsU]{1,4}\))\^(.*)") – Peipei Feb 08 '18 at 20:08
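
A minimal sketch of how the pattern in this comment behaves, assuming Python's `re` module and written as a raw string so that the backreference is simply `\2`. One caveat: because the lookahead uses `.`, it can look past the closing parenthesis into the rest of the input, so a flag letter that reappears shortly after `)^` is treated as a duplicate. `TIGHT` is a hypothetical variant (not from the thread) that limits the lookahead to flag characters. The sample inputs are made up for illustration.

```python
import re

# Pattern from the comment, as a raw string.
ORIGINAL = re.compile(r"(\(\?(?!.{0,2}(.).{0,2}\2)[imsU]{1,4}\))\^(.*)")

# Hypothetical variant: the lookahead may only consume flag characters,
# so it cannot read beyond the closing parenthesis.
TIGHT = re.compile(
    r"(\(\?(?![imsU]{0,2}([imsU])[imsU]{0,2}\2)[imsU]{1,4}\))\^(.*)"
)

def split_flags(pattern, text):
    """Return (flag group, remainder) or None if text does not match."""
    m = pattern.match(text)
    return (m.group(1), m.group(3)) if m else None

for text in ["(?im)^foo", "(?ii)^foo", "(?im)^mi"]:
    print(text, split_flags(ORIGINAL, text), split_flags(TIGHT, text))
```

On "(?im)^mi" the original pattern finds no match, because the "m" after "^" falls within the lookahead's window, while the variant returns the flag group and the remainder.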