Let's say I have this text: "AAAA1 AAA11 AA111AA A1111 AAAAA AAAA1111".

I want to find all occurrences matching these 3 criteria:
- 1 to 4 capital letters
- 1 to 4 digits
- at most 5 characters in total

So the matches would be:
{"AAAA1", "AAA11", "AA111", "A1111", "AAAA1"}

I tried:

([A-Z]{1,4}[0-9]{1,4}){5}

but I knew it would fail, since it looks for five repetitions of my group.

Is there a way to limit what the group matches to 5 characters in total?

Thanks

David says Reinstate Monica
sabatmonk

1 Answer

You can limit the character count with a lookahead while checking the pattern with the matching part of the regex.

If you can split the input by whitespace you can use:

^(?=.{2,5}$)[A-Z]{1,4}[0-9]{1,4}$

See demo here.
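
As a minimal sketch of the split-then-match approach, here is how it could look in Python's `re` module (the choice of Python and the names `TOKEN` and `split_matches` are assumptions for illustration, not part of the original answer). Note that because each token is matched as a whole, `AA111AA` and `AAAA1111` are rejected outright rather than yielding the partial matches listed in the question.

```python
import re

# Pattern from the answer: the lookahead caps the whole token at 2-5
# characters; the rest requires 1-4 capital letters followed by 1-4 digits.
TOKEN = re.compile(r"^(?=.{2,5}$)[A-Z]{1,4}[0-9]{1,4}$")

text = "AAAA1 AAA11 AA111AA A1111 AAAAA AAAA1111"

# Split on whitespace, then test every token as a whole.
split_matches = [tok for tok in text.split() if TOKEN.match(tok)]

print(split_matches)  # ['AAAA1', 'AAA11', 'A1111']
```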

If you cannot split by whitespace, you can use a capturing group, for example (?:^| )(?=.{2,5}(?=$| ))([A-Z]{1,4}[0-9]{1,4})(?=$| ), or use a lookbehind or \K to do the split, depending on your regex flavor (see demo).
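
A rough sketch of the capturing-group variant, again assuming Python's `re` module (the names `GROUPED` and `group_matches` are illustrative only). With a single capturing group, `findall` returns just the captured token, without the leading space consumed by `(?:^| )`.

```python
import re

# Capturing-group pattern from the answer, applied to the unsplit text.
# (?:^| ) anchors the token start, the lookahead limits it to 2-5 characters
# before the next space or the end, and (?=$| ) anchors the token end.
GROUPED = re.compile(r"(?:^| )(?=.{2,5}(?=$| ))([A-Z]{1,4}[0-9]{1,4})(?=$| )")

text = "AAAA1 AAA11 AA111AA A1111 AAAAA AAAA1111"

# findall returns the contents of group 1 for every match.
group_matches = GROUPED.findall(text)

print(group_matches)  # ['AAAA1', 'AAA11', 'A1111']
```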


PREVIOUS ANSWER (wrongly matches A1A1A; updated after @a_guest's remark).

You can use a lookahead to check for your pattern, while limiting the character count with the matching part of the regex:

(?=[A-Z]{1,4}[0-9]{1,4}).{2,5}

See demo here.
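
To illustrate why this earlier version was withdrawn, the following sketch (Python's `re` is an assumed choice; `OLD`, `NEW`, and the result names are hypothetical) compares both patterns on `A1A1A`. The lookahead only checks that a letter-digit run starts at the current position, after which `.{2,5}` is free to consume any characters.

```python
import re

# Previous pattern: the lookahead checks the shape, .{2,5} does the matching.
OLD = re.compile(r"(?=[A-Z]{1,4}[0-9]{1,4}).{2,5}")
# Updated pattern: the lookahead checks the length, the rest checks the shape.
NEW = re.compile(r"^(?=.{2,5}$)[A-Z]{1,4}[0-9]{1,4}$")

old_result = OLD.search("A1A1A")
new_result = NEW.search("A1A1A")

print(old_result.group())  # A1A1A (undesired: letters and digits intermingled)
print(new_result)          # None
```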

Robin
  • This matches for example `A1A1A` as well (because the look ahead distance is smaller than the matching distance) which seems to be undesired according to the OP (considering examples + description). Instead it seems the regex should match a sequence of letters *followed* by a sequence of digits (but not intermingled with), each sequence having length `1-4` and the overall combination having max. length `5`. – a_guest Oct 09 '18 at 14:56
  • thanks @a_guest, I updated the answer to reflect that. – Robin Oct 10 '18 at 13:24
  • Thank you for the correction. It works! – Chris Wong Dec 10 '21 at 23:42