I am attempting to match strings that would have a pattern of:

  • two uppercase Latin letters
  • two digits
  • two uppercase Latin letters
  • four digits
  • ex: MH 45 LE 4098

There can be optional whitespace between the first three groups, and each group needs to be limited to the number of characters listed above. I was trying to group them and set a limit on the characters, but I am not matching any strings that fall within the defined parameters. I had attempted building a set like so: template = '[A-Z{2}0-9{2,4}]', but I was still getting wrong results when the last run of digits exceeded 4.

template = '(A-Z{2})\s?(\d{2})\s?(A-Z{2})\s?(\d{4})'

This was my other attempt, where I tried being more verbose, but then it couldn't match anything.

2 Answers

This is probably the regex you are looking for:

[A-Z]{2}\s?[0-9]{2}\s?[A-Z]{2}\s?[0-9]{4}

Note that \s? allows at most one whitespace character between the groups. Use \s* instead if you want to allow any number of them.
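A quick way to check this pattern is sketched below. It assumes Python's re module (the question does not name a language), and the helper name is_plate and the sample strings are made up for illustration. re.fullmatch requires the whole string to fit, whereas re.search would also accept a string with a fifth trailing digit, because the first four digits already satisfy the pattern.

```python
import re

# Pattern from the answer above; the raw string keeps the backslashes intact.
PLATE = re.compile(r"[A-Z]{2}\s?[0-9]{2}\s?[A-Z]{2}\s?[0-9]{4}")

def is_plate(text):
    """Return True only if the entire string fits the pattern."""
    return PLATE.fullmatch(text) is not None

# Hypothetical sample inputs, chosen to exercise each rule.
samples = [
    "MH 45 LE 4098",   # single spaces between groups
    "MH45LE4098",      # no whitespace at all
    "MH  45 LE 4098",  # two spaces: \s? permits at most one
    "MH 45 LE 40981",  # five trailing digits
]

for s in samples:
    print(repr(s), is_plate(s))

# search() only needs a matching substring, so the extra digit is ignored.
print(PLATE.search("MH 45 LE 40981") is not None)
```

If substring matching is needed, anchoring the pattern with \b or ^...$ is one way to get the same strictness.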

Sufian Latif

You are close; you need to put square brackets around A-Z so that {2} applies to the whole range instead of only to Z. As it stands, A-Z{2} literally matches the text A-ZZ.

So

template = "([A-Z]{2})\s?(\d{2})\s?([A-Z]{2})\s?(\d{4})"

should do. We use [ instead of ( because square brackets denote a range of letters. With parentheses alone, (A-Z){2} would try to match A-ZA-Z, i.e. the literal text A-Z two times.

You can see a demo here; try changing the square brackets to ( or omitting them to see the effect in the demo.
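The difference between these spellings can also be checked directly. The following is a minimal sketch assuming Python's re module; the helper name matches and the test strings are hypothetical. It also covers the set from the question, where braces and digits inside [...] are ordinary class members, so the whole set matches exactly one character.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text fits pattern."""
    return re.fullmatch(pattern, text) is not None

# (pattern, candidate text) pairs illustrating each spelling.
checks = [
    (r"A-Z{2}", "A-ZZ"),       # {2} repeats only the Z
    (r"A-Z{2}", "MH"),         # so two letters do not match
    (r"(A-Z){2}", "A-ZA-Z"),   # the group repeats the literal text A-Z
    (r"[A-Z]{2}", "MH"),       # the class matches any two uppercase letters
    (r"[A-Z{2}0-9{2,4}]", "{"),   # inside a set, { is just a character
    (r"[A-Z{2}0-9{2,4}]", "MH"),  # the set matches one character only
]

for pattern, text in checks:
    print(pattern, repr(text), matches(pattern, text))
```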

Mustafa Aydın