I have a list like this:

List<String> lista = Arrays.asList("[2017-07-01,1234567890,1234567890,06CA...]",
                                   "[2017-07-02,1234567890,txt,06CA...]");

I tried this regex

(\\d{4}-\\d{2}-\\d{2})(\\,)(\\w+)(\\,)(\\w+)(\\,)(\\w+)\\(w+)

But this gives me:

Group 0 : 2017-07-02,1234567890,1234567891,02BA...
Group 1 : 2017-07-02
Group 2 : ,
Group 3 : 1234567890
Group 4 : , 
Group 5 : 1234567891
Group 6 : 06CA...

I expect output like this:

Group 0: date
Group 1: no1
Group 2: no2 or txt
Group 3: code
azro
Wasfy
  • You have a typo - `\\(w+)` must be `(\\w+)` (at the end), see https://regex101.com/r/ODacey/1. BTW, there are too many capturing groups, why use `(,)`? Try just [`"(\\d{4}-\\d{2}-\\d{2}),(\\w+),(\\w+),(\\w+)"`](https://regex101.com/r/ODacey/2) – Wiktor Stribiżew Jul 31 '17 at 09:17
  • Why don't you simply split on comma? No need to use sophisticated regex for this. – dpr Jul 31 '17 at 09:29
  • I want each as a group – Wasfy Jul 31 '17 at 09:34
  • @Wasfy What "each"? Are you trying to make a Java regex return [*repeated capturing group*](http://www.regular-expressions.info/captureall.html) submatches? It is not possible to achieve with a Java regex. You should really strip the `[` and `]` at both ends, and then `split("\\s*,\\s*")` – Wiktor Stribiżew Jul 31 '17 at 09:35
  • It seems this question is a dupe of [Java regex: Repeating capturing groups](https://stackoverflow.com/questions/6939526/java-regex-repeating-capturing-groups) – Wiktor Stribiżew Jul 31 '17 at 09:38
  • @WiktorStribiżew I don't see ANY indication of wanting repeating capturing groups. I do agree with the fact that OP should strip the `[` and `]` and use split though. – Kaamil Jasani Jul 31 '17 at 09:48
  • Thanks to all for sharing; @SchoolBoy solved the problem. – Wasfy Jul 31 '17 at 10:15

1 Answer

The regex matcher always returns groups as follows:

Group 0: Full match
Group 1: Capture group 1
Group 2: ...

You are ending up with extra groups because you are also capturing the commas: every pair of parentheses in the pattern, including `(\\,)`, creates its own numbered group.

This regex will result in the following groups:

(\\d{4}-\\d{2}-\\d{2}),(\\w+),(\\w+),(\\w+)

Group 0: Full match (i.e. everything in the square brackets)
Group 1: Date
Group 2: No 1
Group 3: No 2 or text
Group 4: Code
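
As a minimal sketch of how those groups could be read in Java, the pattern above can be compiled once and queried with `Matcher.group(int)`. The class name `GroupDemo`, the helper `fields`, and the sample strings are assumptions made for illustration; in particular, the question elides the code after `06CA`, so a short stand-in value is used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Four capture groups; the commas are matched but not captured.
    private static final Pattern ENTRY =
            Pattern.compile("(\\d{4}-\\d{2}-\\d{2}),(\\w+),(\\w+),(\\w+)");

    // Hypothetical helper: returns groups 1..4 of the first match,
    // or an empty list if the input does not match.
    public static List<String> fields(String input) {
        Matcher m = ENTRY.matcher(input);
        List<String> out = new ArrayList<>();
        if (m.find()) {
            // Group 0 is the full match, so start at 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in data: the real code values are elided in the question.
        List<String> lista = Arrays.asList("[2017-07-01,1234567890,1234567890,06CA]",
                                           "[2017-07-02,1234567890,txt,06CA]");
        for (String s : lista) {
            System.out.println(fields(s));
        }
    }
}
```

Because `find()` is used rather than `matches()`, the surrounding `[` and `]` do not need to be part of the pattern.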

However, a better way to do this may be to strip the [ and ] by doing:

String stripped = yourString.substring(1, yourString.length() - 1);

And then splitting it by doing:

String[] parts = stripped.split(",");

This will result in an array in exactly the format that you expected:

[0] -> Date
[1] -> No 1
[2] -> No 2 or Text
[3] -> Code
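
Put together, the strip-and-split approach might look like the sketch below. The class name `SplitDemo`, the helper `fields`, and the sample string are assumptions for illustration (the code after `06CA` is elided in the question, so a stand-in is used). The bracket check is an added safeguard so that input without brackets is left untouched.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: strips one leading '[' and one trailing ']'
    // when both are present, then splits the rest on commas.
    public static String[] fields(String entry) {
        String stripped = entry;
        if (stripped.startsWith("[") && stripped.endsWith("]")) {
            stripped = stripped.substring(1, stripped.length() - 1);
        }
        return stripped.split(",");
    }

    public static void main(String[] args) {
        // Stand-in data: the real code value is elided in the question.
        String[] parts = fields("[2017-07-02,1234567890,txt,06CA]");
        System.out.println(Arrays.toString(parts));
    }
}
```

Unlike the regex, this does not validate the shape of each field, so it suits input that is already known to be well formed.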
Kaamil Jasani