Is there a way to get a single regex to satisfy this condition?

I am looking for a "word" that has three letters from the set MBDPI, in any order, but MUST contain an I.

i.e.

re.match("[MBDPI]{3}", foo) and "I" in foo

This gives the correct result (in Python, using the re module), but can I get the same from a single regex?

>>> import re
>>> for foo in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
...     print(foo, bool(re.match("[MBDPI]{3}", foo)) and "I" in foo)
MBI True
MIB True
BIM True
BMI True
IBM True
IMB True
MBD False

With regex I know I can use | as a boolean OR operator, but is there a boolean AND equivalent?

Or maybe I need some lookahead or lookbehind?

user213043
  • You can also search for the character 'I' with str.find(). Source: http://docs.python.org/library/stdtypes.html#str.find – Dor Mar 05 '10 at 09:59

4 Answers

You can fake boolean AND by using lookaheads. According to http://www.regular-expressions.info/lookaround2.html, this will work for your case:

"\b(?=[MBDPI]{3}\b)\w*I\w*"
Jens

with regex I know I can use | as a boolean OR operator, but is there a boolean AND equivalent?

A and B = not ( not A or not B) = (?![^A]|[^B])

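Here A and B are character sets that may have members in common. The negative lookahead is zero-width, so it has to be followed by something that consumes the character, such as a dot.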
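
A minimal sketch of that identity in Python, reading A and B as character sets. The two ranges [A-M] and [H-Z] are assumptions chosen only for illustration; they are not from the question. Note that this tests one character at a time, not a whole word:

```python
import re

# Assumption for illustration: A = [A-M], B = [H-Z]; their overlap is H..M.
# (?![^A-M]|[^H-Z]) rejects a next character that is outside A or outside B;
# the trailing "." then consumes the character that passed both tests.
IN_BOTH = re.compile(r"(?![^A-M]|[^H-Z]).")

# Collect every character of the sample string that belongs to both sets.
print(IN_BOTH.findall("ABCHIJKLMNXYZ"))
```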

Jan Heldal

An "or" (alternation) is about the only thing you can do:

\b(I[MBDPI]{2}|[MBDPI]I[MBDPI]|[MBDPI]{2}I)\b

The \b escape matches a zero-width word boundary. Together, the two boundaries ensure you match a word that is exactly three characters long.
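
A quick way to try the alternation in Python, as a sketch using the test words from the question (the name ALTERNATION is made up for this example):

```python
import re

# One alternative per possible position of the required I.
ALTERNATION = re.compile(r"\b(I[MBDPI]{2}|[MBDPI]I[MBDPI]|[MBDPI]{2}I)\b")

for word in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
    # bool() turns the match object (or None) into True/False.
    print(word, bool(ALTERNATION.match(word)))
```

Only MBD should come out False, since it has no I; a four-letter word such as MBIP is rejected by the trailing \b.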

You're otherwise running into the limits of what a regular language can do.

An alternative is to match:

\b[MBDPI]{3}\b

capture the match and then check it for an I.
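
Roughly, the two-step version could look like this in Python. It is a sketch; the function name and the sample sentence are invented for illustration:

```python
import re

# Step 1: find every standalone three-letter word drawn from MBDPI.
THREE_LETTERS = re.compile(r"\b([MBDPI]{3})\b")

def three_letter_words_with_i(text):
    """Hypothetical helper: keep only the matches that contain an I."""
    # Step 2: plain string test on each captured word.
    return [w for w in THREE_LETTERS.findall(text) if "I" in w]

print(three_letter_words_with_i("MBI MBD IBM MBIP PIB"))
```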

Edit: for the sake of having a complete answer, I'll adapt Jens' answer, which uses the technique described in "Testing The Same Part of a String for More Than One Requirement":

\b(?=[MBDPI]{3}\b)\w*I\w*

The word boundary checks ensure the word is exactly three characters long.
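
As a sketch, this is how the lookahead pattern could be exercised in Python against the words from the question (has_i_word is a made-up helper name):

```python
import re

# The lookahead asserts "exactly three letters from MBDPI, then a word boundary"
# without consuming anything; \w*I\w* then requires an I somewhere in the word.
PATTERN = re.compile(r"\b(?=[MBDPI]{3}\b)\w*I\w*")

def has_i_word(text):
    """Hypothetical helper: True if text starts with a qualifying word."""
    return PATTERN.match(text) is not None

for word in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
    print(word, has_i_word(word))
```

Both conditions are tested at the same starting position, which is what makes the lookahead behave like an AND.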

This is a more advanced solution and applies in more situations, but I'd generally favour whatever is easier to read (the "or" version, imho).

cletus
  • OMFG!! thank you, the easier to read version doesnt work for me, BUT, your general example and the link to lookaround saved my brains!! while much harder to READ, it is the only method when the conditional clause becomes much more complex. My use case (?=(pattern1|pattern2)) (pattern1)? (pattern2)? pattern3 Ugly to read, but the only parsimonious solution. – Peter Cibulskis Mar 25 '22 at 19:24

You could use lookahead to see if an I is present:

(?=[MBDPI]{0,2}I)[MBDPI]{3}
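
A minimal Python sketch of this pattern. The pattern itself carries no word boundaries, so the example uses re.fullmatch to require that the whole string is the three-letter word; that choice is an assumption of this sketch, not part of the pattern above:

```python
import re

# Lookahead: an I must appear within the first three letters.
# Main part: exactly three letters from the allowed set.
LOOKAHEAD = re.compile(r"(?=[MBDPI]{0,2}I)[MBDPI]{3}")

for word in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
    # fullmatch (Python 3.4+) rejects anything longer than three letters.
    print(word, bool(LOOKAHEAD.fullmatch(word)))
```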
Bart Kiers