I would like to match all strings that start with 1 to 4 lowercase letters followed by 1 to 4 digits, where the overall length of that sequence (letters + digits) is exactly 5. The letters and digits must not intermingle. The actual string, however, is much longer, and this 5-character sequence is not followed by any distinct word boundary (it can be followed by [a-z0-9], for example). The regex in question should only be concerned with the first 5 characters.
For example:
- Positive matches: a1111, aa111, abc12def, abc12345, ...
- Negative matches: a1a1a, aa11a, aa11, 1aaaa x, ...
So I would need something like ^([a-z]{1,4})[0-9]{5 - length of \1}.
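To make the intent concrete, here is a minimal sketch of one way the "5 - length of \1" idea might be expressed with Python's `re` module, assuming single-line input strings. It does not compute a length at all. Instead, a lookahead first captures everything after the first 5 characters, and a second lookahead requires that the letters-then-digits run ends exactly where that captured remainder begins. The names `PREFIX`, `split_prefix`, and the group names `rest`, `letters`, and `digits` are made up for illustration.

```python
import re

# Sketch only: assumes single-line strings (no embedded newlines).
PREFIX = re.compile(
    r"^(?=.{5}(?P<rest>.*))"     # remember everything after the first 5 characters
    r"(?P<letters>[a-z]{1,4})"   # 1 to 4 lowercase letters
    r"(?P<digits>[0-9]{1,4})"    # 1 to 4 digits, may backtrack to fewer
    r"(?=(?P=rest)$)"            # what follows must be exactly the remembered remainder
)


def split_prefix(s):
    """Return (letters, digits) for a valid 5-character prefix, else None."""
    m = PREFIX.match(s)
    return (m.group("letters"), m.group("digits")) if m else None


if __name__ == "__main__":
    for s in ["a1111", "aa111", "abc12def", "abc12345",
              "a1a1a", "aa11a", "aa11", "1aaaa x"]:
        print(s, split_prefix(s))
```

Because the final lookahead consumes nothing, a pattern for the rest of the string could be appended directly after it. For longer groups, only the `5` and the `{1,4}` bounds would need to change. The digit group backtracks at most as many steps as its upper bound, but I have not measured how this behaves on long inputs.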
This question seems to be slightly related but I couldn't figure out how to make the length of the second group dependent on the first. This answer suggests to perform a lookahead on all the possible characters but doesn't prevent intermingling.
I don't want to perform a match on only the first five characters of the string (and then check the length of the actual match), since I would like to augment this regex in order to match the remainder of the string with some other pattern.
The group lengths are small for the sake of the example; the real ones are much longer, so manually specifying the various combinations is not an option, and auto-generating a regex that contains all the combinations makes me worry about performance.
Specifically, I am using Python 3.6, but I would also welcome solutions for other regex flavors.