I want to understand how capturing groups work in Java regex.
When I use the regex
([\u25CB\u25CF])\s+([\u25CB\u25CF])\s+([\u25CB\u25CF])\s+([\u25CB\u25CF])\s+
on the text
Prozesse & Methoden
Technische Dokumentation ○ ○ ● ○
OSI Model ○ ○ ● ○
I would expect the first match to have groups 0 to 3 containing "○", "○", "●", "○": four groups with one circle each.
But in fact I get "○ ○ ● ○", "○", "○", "●". Each pair of parentheses spans only one character, so how can the first group encompass the whole expression?
When I add an empty group at the end of my expression, the result looks like this:
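For reference, here is a minimal sketch that reproduces this observation with java.util.regex. The class name GroupDemo and the helper firstMatchGroups are made up for illustration. It lists every group from index 0 up to and including Matcher.groupCount(); in Java, group(0) always denotes the entire match, and the parenthesized groups are numbered from 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // The regex from the question; "\\u25CB" reaches the regex engine as \u25CB.
    static final Pattern CIRCLES = Pattern.compile(
        "([\\u25CB\\u25CF])\\s+([\\u25CB\\u25CF])\\s+([\\u25CB\\u25CF])\\s+([\\u25CB\\u25CF])\\s+");

    // Hypothetical helper: returns groups 0..groupCount() of the first match,
    // or an empty list if nothing matches.
    static List<String> firstMatchGroups(String text) {
        Matcher m = CIRCLES.matcher(text);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            // groupCount() does not count group 0, hence the <= here.
            for (int i = 0; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String text = "Prozesse & Methoden\n"
            + "Technische Dokumentation \u25CB \u25CB \u25CF \u25CB\n"
            + "OSI Model \u25CB \u25CB \u25CF \u25CB\n";
        List<String> groups = firstMatchGroups(text);
        for (int i = 0; i < groups.size(); i++) {
            // Group 0 also contains the whitespace consumed by the trailing \s+.
            System.out.println("group " + i + ": [" + groups.get(i) + "]");
        }
    }
}
```

Listed this way, the first match has five entries: the whole match at index 0, followed by the four single circles at indices 1 to 4.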
([\u25CB\u25CF])\s+([\u25CB\u25CF])\s+([\u25CB\u25CF])\s+([\u25CB\u25CF])\s+()
"○ ○ ● ○", "○", "○", "●", "○"
Only then does the last non-empty group show up. I cannot understand why.
Tested both with Java 1.8 and on the website http://www.freeformatter.com/regex-tester.html
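One way to reproduce both outputs is sketched below, under the assumption (not confirmed for my code or the website) that the groups are listed with a loop that starts at 0 and stops before Matcher.groupCount(). Since groupCount() excludes group 0, such a loop always drops the last group; appending an empty group raises the count by one, so the fourth circle becomes visible. GroupCountDemo and listGroups are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    // One capturing group for a circle, followed by whitespace.
    static final String CIRCLE = "([\\u25CB\\u25CF])\\s+";

    // Hypothetical helper: lists groups of the first match, starting at group 0.
    // inclusive = true  -> indices 0..groupCount()     (all groups)
    // inclusive = false -> indices 0..groupCount() - 1 (assumed off-by-one loop)
    static List<String> listGroups(String regex, String text, boolean inclusive) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> out = new ArrayList<>();
        if (m.find()) {
            int last = inclusive ? m.groupCount() : m.groupCount() - 1;
            for (int i = 0; i <= last; i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "OSI Model \u25CB \u25CB \u25CF \u25CB\n";
        String four = CIRCLE + CIRCLE + CIRCLE + CIRCLE; // regex from the question
        String five = four + "()";                        // with the empty group appended

        System.out.println(listGroups(four, text, false)); // stops before the 4th circle
        System.out.println(listGroups(five, text, false)); // 4th circle now included
        System.out.println(listGroups(five, text, true));  // all groups, last one empty
    }
}
```

If that assumption holds, looping with `i <= m.groupCount()` on the original four-group regex would return all four circles without needing the extra empty group.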