
My regular expression looks like this: "[a-zA-Z]+[ \t]*(?:,[ \t]*(\\d+)[ \t]*)*"

I can match the lines with this, but I don't know how to capture the numbers. I think it has something to do with grouping.

For example, from the string "asd , 5 ,2,6 ,8", how can I capture the numbers 5, 2, 6 and 8?

A few more examples:

sdfs6df -> no capture

fdg4dfg, 5 -> capture 5

fhhh3      ,     6,8    , 7 -> capture 6 8 and 7

asdasd1,4,2,7 -> capture 4 2 and 7

I need these numbers to continue my work. Thanks in advance.

remmaks
  • Is there any constraints ? Or just get the numbers ? – azro Apr 25 '20 at 21:22
  • Just use `"\\d+"` with `Matcher#find()` in a loop. See [How to extract numbers from a string and get an array of ints?](https://stackoverflow.com/questions/2367381/how-to-extract-numbers-from-a-string-and-get-an-array-of-ints) – Wiktor Stribiżew Apr 25 '20 at 21:22
  • So don't bother you for the rest of the string just \\d+ – azro Apr 25 '20 at 21:25
  • @remmaks Do you mean like this? `(?:\w+|\G(?!^))\h*,\h*([0-9]+)` https://regex101.com/r/n14Shg/1 – The fourth bird Apr 25 '20 at 21:55
  • @Thefourthbird yeah, that's nice, thanks. – remmaks Apr 25 '20 at 21:58
  • @WiktorStribiżew I think the OP was looking for a more specific way to get the matches. Do you agree if I reopen it and post it? – The fourth bird Apr 25 '20 at 22:06
  • @remmaks Does [this solution](https://stackoverflow.com/a/2367418/3832970) work for you? Does it produce the result you want with all your test cases? – Wiktor Stribiżew Apr 25 '20 at 22:08
  • @WiktorStribiżew No, that is not enough for me unfortunately. But maybe my description was bad, but thanks anyway! Thefourbird's solution is what i was searching for. – remmaks Apr 25 '20 at 22:18
  • [Your comment](https://stackoverflow.com/questions/61432785/in-java-with-regular-expressions-how-to-capture-numbers-from-a-string-with-unkn?noredirect=1#comment108673162_61432785) was misleading, please remove it. @Thefourthbird Please post. – Wiktor Stribiżew Apr 25 '20 at 22:19
  • So yes, the problem was that the first word in the string can contain numbers too, and i do not want to capture those. Sorry for bad description. – remmaks Apr 25 '20 at 22:20

1 Answer

You could match the leading word characters once, then use the \G anchor to continue from the end of the previous match and capture each run of digits that follows a comma.

Pattern

(?:\w+|\G(?!^))\h*,\h*([0-9]+)

Explanation

  • (?: Non-capturing group
    • \w+ Match 1+ word characters
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, not at the start of the string
  • ) Close the non-capturing group
  • \h*,\h* Match a comma between optional horizontal whitespace characters
  • ([0-9]+) Capture group 1, match 1+ digits
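To see why the pattern from the question cannot return every number, the sketch below compares both approaches on the question's sample string. In Java, a capturing group inside a repeated group keeps only its last repetition, whereas the \G pattern produces one match per number. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {
    // Pattern from the question: the capturing group sits inside a repeated group.
    static final Pattern REPEATED =
            Pattern.compile("[a-zA-Z]+[ \\t]*(?:,[ \\t]*(\\d+)[ \\t]*)*");

    // Pattern from this answer: each find() yields exactly one number.
    static final Pattern CONTINUED =
            Pattern.compile("(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)");

    // Returns what group 1 holds after a full match, or null if there is no match.
    static String lastRepetition(String input) {
        Matcher m = REPEATED.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Collects group 1 of every successive match.
    static List<String> eachNumber(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = CONTINUED.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "asd , 5 ,2,6 ,8";
        System.out.println(lastRepetition(input)); // 8
        System.out.println(eachNumber(input));     // [5, 2, 6, 8]
    }
}
```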

In Java with double escaped backslashes:

String regex = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";

Example code

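import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String regex = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";
        // The test lines from the question, joined into one input string
        String string = "sdfs6df -> no capture\n\n"
             + "fdg4dfg, 5 -> capture 5\n\n"
             + "fhhh3      ,     6,8    , 7 -> capture 6 8 and 7\n\n"
             + "asdasd1,4,2,7 -> capture 4 2 and 7";

        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(string);

        // Each successful find() is one number; group 1 holds its digits
        while (matcher.find()) {
            System.out.println(matcher.group(1));
        }
    }
}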

Output

5
6
8
7
4
2
7
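
If the numbers are needed per line for further processing, one option is to wrap the loop in a small helper that returns them as integers. This is only a sketch: `TrailingNumbers` and `extract` are hypothetical names, and it assumes every captured value fits in an `int`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingNumbers {
    private static final Pattern NUMBER_AFTER_COMMA =
            Pattern.compile("(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)");

    // Hypothetical helper: returns the comma-separated numbers that follow
    // the leading word of a single line, in order of appearance.
    static List<Integer> extract(String line) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = NUMBER_AFTER_COMMA.matcher(line);
        while (matcher.find()) {
            // Assumption: the digits fit in an int; parseInt throws
            // NumberFormatException otherwise.
            numbers.add(Integer.parseInt(matcher.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // The four test lines from the question
        String[] lines = {
            "sdfs6df",
            "fdg4dfg, 5",
            "fhhh3      ,     6,8    , 7",
            "asdasd1,4,2,7"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + extract(line));
        }
    }
}
```

The digits inside the first word (such as the 6 in sdfs6df) are never returned, because a number is only captured when it directly follows a comma.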
The fourth bird