
I have a set of strings that contain letters, occasionally a single digit, and somewhere a group of 2 or 3 digits. I need to match that group of 2 or 3 digits. I have this:

    \w*(\d{2,3})\w*

but then for strings like

    AAA1AAA12A
    AAA2AA123A

it matches '12' and '23' respectively, i.e. it fails to pick the three digits in the second case. How do I get those 3 digits?

miguello
  • Since not all regex engines are created equal, please provide a language tag. – WJS Jul 12 '22 at 21:04
  • Note that `\w` also matches digits (`\d`). What are the expected matches for `AAA2AA123AAAA2AA123A` and should only 3 digits also match? What is the tool or language? – The fourth bird Jul 12 '22 at 21:30
  • @Thefourthbird The specification for the input string says that there is only one group of 2 or 3 digits. All other digits are single digits. I need the only group of 2 or 3 digits – miguello Jul 13 '22 at 16:49
  • @WJS Included. It's JSL - I'm not sure which flavor it uses though – miguello Jul 13 '22 at 16:49

2 Answers


Here is how you would do it in Java.

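  • The regex matches a group of 2 or 3 digits that has a non-digit (`\D`) immediately before and after it; the digits themselves are captured in group 1.
  • The while loop calls `find()` repeatedly and prints each captured group. The single digits (1 and 2) and the four-digit 1223 are ignored.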
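    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "AAA1AAA12Aksk2ksksk21sksksk123ksk1223sk";
    // Capture 2 or 3 digits with a non-digit immediately before and after.
    String regex = "\\D(\\d{2,3})\\D";
    Matcher m = Pattern.compile(regex).matcher(s);
    // Each call to find() advances to the next match.
    while (m.find()) {
        System.out.println(m.group(1));
    }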

prints

12
21
123
WJS
  • Note that you don't need the non-capturing groups `(?:` as you are not using, for example, a quantifier or alternation. Specifying `\D` on the left and right means that there has to be a non-digit on the left and on the right. It depends on whether the OP wants to match multiple occurrences. If there has to be a word char excluding a digit on the left or right, OR the start or end of the string: `(?<=[^\W\d]|^)\d{2,3}(?=[^\W\d]|$)` https://regex101.com/r/k70ocQ/1 You could also write your pattern as `(?<!\d)\d{2,3}(?!\d)` if other characters on the left and right are also allowed and lookarounds are supported – The fourth bird Jul 12 '22 at 21:28
  • You're right! Fixed. Thanks. – WJS Jul 12 '22 at 21:32
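
The lookaround pattern from the comment above, `(?<!\d)\d{2,3}(?!\d)`, can be sketched in Java as follows. This is only an illustration: the class name `DigitRunFinder`, the method `findRuns`, and the sample inputs are made up for the example, and it assumes an engine that supports lookbehind and lookahead (Java's does; whether JSL's does is not confirmed here). Unlike `\D(\d{2,3})\D`, the lookarounds consume no characters, so a group at the very start or end of the string is also found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class DigitRunFinder {
    // 2 or 3 digits with no digit directly before or after them.
    private static final Pattern RUN = Pattern.compile("(?<!\\d)\\d{2,3}(?!\\d)");

    // Returns every run of exactly 2 or 3 digits found in the input.
    static List<String> findRuns(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(findRuns("12A1AAA345")); // groups at both ends of the string
        System.out.println(findRuns("AAA2AA123A")); // second string from the question
        System.out.println(findRuns("ksk1223sk"));  // a four-digit run is not a match
    }
}
```

For the question's input, where only one such group exists, the first element of the returned list is the group being looked for.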

Looks like the correct answer would be:

    \w*?(\d{2,3})\w*

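Basically, making the preceding `\w*` lazy does the job. The greedy `\w*` consumes as many characters as it can, including the first digit of a three-digit group, and leaves only two digits for `\d{2,3}`. The lazy `\w*?` consumes as few characters as possible, so `\d{2,3}` starts at the first digit of the group and takes all three.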
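
The difference between the greedy and the lazy prefix can be checked with a small Java sketch. Java is used here only because the other answer uses it; the JSL flavor is unknown, so the assumption is that it treats `*` and `*?` the same way. The class name `LazyPrefixDemo` and the helper `capture` are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this example only.
class LazyPrefixDemo {
    static final String GREEDY = "\\w*(\\d{2,3})\\w*";
    static final String LAZY = "\\w*?(\\d{2,3})\\w*";

    // Matches the whole input against the regex and returns group 1,
    // or null if the input does not match.
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"AAA1AAA12A", "AAA2AA123A"}) {
            System.out.println(s + " greedy=" + capture(GREEDY, s)
                    + " lazy=" + capture(LAZY, s));
        }
    }
}
```

With the greedy prefix the second string yields `23`, as described in the question; with the lazy prefix it yields `123`, while the first string yields `12` in both cases.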

miguello