
I am trying to write a single regex which matches strings

"A","B","AB"

and which does not match

"","AA","BB","BA"

I tried a simple pattern

re.search(r'^(A)?(B)?',sample_str)

But this pattern also matches the empty string "". I know of several solutions that get around this by combining multiple patterns with logical operations, but is it possible to do it with a single pattern?

Thank you in advance.

Wiktor Stribiżew
Akarsh J D
  • I can't figure out the pattern, for ex: why "AB" is OK while "BA" is not. Is it like you construct a language out of "A" and "B" in order? – Curcuma_ Apr 27 '18 at 15:04
  • If it only accepts these values, why don't you just compare against the values? – Curcuma_ Apr 27 '18 at 15:06
  • You can just `re.search(r'A|B|AB', sample_str)`, but if you want to find substrings matching that the problem is kind of ill-posed, giving `"AB"` could be interpreted as two matches, `A` and `B`, or one match `AB`. – jdehesa Apr 27 '18 at 15:07
  • @Curcuma_ I cannot compare the values because A and B are two blocks of strings that include other characters. I was trying to generalize the question. – Akarsh J D Apr 27 '18 at 15:12
  • Please don't generalize it to the point the question doesn't make sense. Give a complete example of what pattern you would like to match. – user3483203 Apr 27 '18 at 15:21
  • @jdehesa Consider `A` and `B` themselves as big blocks of strings. I was trying to avoid using multiple logical operations. – Akarsh J D Apr 27 '18 at 15:23
  • Try `re.search(r'^(?!$)(A)?(B)?$',sample_str)` – Wiktor Stribiżew Apr 27 '18 at 17:19

3 Answers


Try this out. I modified a regex from this answer.

regex statement = ^((?=A)A(B?)|(?=B)B)(?!.*(\w)\1)$


import re

d = ['A', 'B', 'AB', '', 'AA', 'BB', 'BA']
for n in d:
    # Raw string so that \w and \1 reach the regex engine unchanged
    t = re.search(r'^((?=A)A(B?)|(?=B)B)(?!.*(\w)\1)$', n)
    print(t)

<_sre.SRE_Match object; span=(0, 1), match='A'>
<_sre.SRE_Match object; span=(0, 1), match='B'>
<_sre.SRE_Match object; span=(0, 2), match='AB'>
None
None
None
None
W Stokvis

It seems the anchor (^) needs to be removed, like this:

(A)?(B)?

Demo

If you want to match A, B, or AB, I recommend trying the expression AB|A|B. Demo
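As a rough sketch of how the alternation could be checked against the samples from the question: the snippet below assumes `re.fullmatch` is used rather than `re.search`, because an unanchored `AB|A|B` would still find a `B` inside "BA". The names `A`, `B`, `pattern`, and `results` are placeholders for illustration only.

```python
import re

# Placeholder sub-patterns standing in for the two blocks (assumption:
# neither of them can match the empty string).
A = r"A"
B = r"B"

# Longest alternative first, wrapped in a non-capturing group.
pattern = re.compile(rf"(?:{A}{B}|{A}|{B})")

samples = ["A", "B", "AB", "", "AA", "BB", "BA"]

# fullmatch only succeeds when the whole string is consumed.
results = {s: pattern.fullmatch(s) is not None for s in samples}

for s, ok in results.items():
    print(repr(s), ok)
```

The drawback the asker mentioned still applies: each block is written out twice in the pattern.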

Thm Lee

You mentioned that the problem with the current solution is that it matches the empty string. You also mentioned that you do not want to write A and B more than once, which the current solution already satisfies.

In that case, you can exclude the empty match with a positive lookahead that ensures at least one character is present:

re.search(r'^(?=.)(A)?(B)?$',sample_str)

I also added the end-of-string anchor $ to ensure that trailing characters are not allowed (your AA and BB examples).

Check it on regex101.com

Of course, this assumes that whatever regexes A and B turn out to be, neither of them matches the empty string.

(I just noticed that a similar method was proposed by Wiktor Stribiżew in the comments.)
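Here is a minimal sketch of the lookahead approach with multi-character blocks, since the asker said A and B are really larger sub-patterns. The blocks `foo\d+` and `bar\d+`, the helper `is_valid`, and the sample strings are all made up for illustration; they are not from the question.

```python
import re

# Hypothetical blocks standing in for A and B (assumption: neither can
# match the empty string).
A = r"foo\d+"
B = r"bar\d+"

# (?=.) demands at least one character, so the all-optional body cannot
# succeed on "". Each block appears only once in the pattern.
pattern = re.compile(rf"^(?=.)(?:{A})?(?:{B})?$")


def is_valid(s):
    """Return True if s is A, B, or A followed by B."""
    return pattern.search(s) is not None


# Expected outcome for each sample, mirroring the cases in the question.
checks = {
    "foo1": True,       # A alone
    "bar2": True,       # B alone
    "foo1bar2": True,   # A then B
    "": False,          # empty string
    "foo1foo1": False,  # A twice
    "bar2bar2": False,  # B twice
    "bar2foo1": False,  # wrong order
}

for s, expected in checks.items():
    print(repr(s), is_valid(s), expected)
```

Note that in Python `$` also matches just before a trailing newline, so if that matters, `\Z` or `re.fullmatch` may be a safer choice.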

justhalf