Split may not be the right tool here: it needs a delimiter between elements, and nothing separates the third element from the fourth.
Capturing rather than splitting may work better (though I have not checked how it compares in terms of performance).
You could try this:
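var pattern = /(([^.,]+?)([.,]|\{\})) */g;
var captures = [];
var match; // declared explicitly to avoid creating an implicit global
var s = 'First capture.Second capture, $THIRD_CAPTURE{} fourth capture.';
while ((match = pattern.exec(s)) !== null) {
    if (match[3] === "." || match[3] === ",") {
        // Ending is a dot or a comma: keep only the content.
        captures.push(match[2]);
    } else {
        // Ending is "{}": keep the whole block, braces included.
        captures.push(match[1]);
    }
}
console.log(captures);
// ["First capture", "Second capture", "$THIRD_CAPTURE{}", "fourth capture"]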
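// Second example with the same pattern: lastIndex is back to 0
// once exec() has returned null, so the regex can be reused as is.
captures = [];
s = 'The cats climbed the tall tree.In this sentence, $U_SEL{} is a noun.';
while ((match = pattern.exec(s)) !== null) {
    if (match[3] === "." || match[3] === ",") {
        captures.push(match[2]);
    } else {
        captures.push(match[1]);
    }
}
console.log(captures);
// ["The cats climbed the tall tree", "In this sentence", "$U_SEL{}", "is a noun"]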
The principle is as follows.
- Capture blocks made of a part of the sentence that contains no dot or comma, ended by a dot, a comma, or an empty pair of braces
- Within each block, capture both the content and the ending (a dot, a comma, or an empty pair of braces)
For each resulting match, you have three captures:
- At index 1, the whole block (content plus ending)
- At index 2, the content without the ending
- At index 3, the ending
Then, depending on the ending, either the capture at index 1 or the one at index 2 is stored.
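To make the three captures concrete, here is a small sketch that lists them for each match of the first sample string. It uses String.prototype.matchAll (ES2020) instead of the exec loop; the names blockPattern, sample and groups are only illustrative.

```javascript
// Same regex as in the answer; matchAll requires the g flag.
const blockPattern = /(([^.,]+?)([.,]|\{\})) */g;
const sample = 'First capture.Second capture, $THIRD_CAPTURE{} fourth capture.';

// One row per match: [whole block, content, ending].
const groups = Array.from(sample.matchAll(blockPattern), (m) => [m[1], m[2], m[3]]);
console.log(groups);
```

Each row should show the block at index 1 split into its content and its ending, for example "First capture." into "First capture" and ".".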
You could change how the loop chooses between the two captures to get exactly the output you want, with the dot kept on the first capture and not on the last one, unless that difference is a typo in your question.
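As a rough sketch of such a modification, the selection rule can be passed in as a function. Everything here is an assumption for illustration: extractCaptures and keepEnding are hypothetical names, and the rule "keep the punctuation only on the first capture" is just one guess at the intended output.

```javascript
// Hypothetical helper: keepEnding(ending, index) decides whether the
// dot or comma stays attached to the stored capture.
function extractCaptures(s, keepEnding) {
  const pattern = /(([^.,]+?)([.,]|\{\})) */g;
  const captures = [];
  let index = 0;
  for (const m of s.matchAll(pattern)) {
    const ending = m[3];
    // Braces are always kept, as in the original loop.
    const keep = ending === '{}' || keepEnding(ending, index);
    captures.push(keep ? m[1] : m[2]);
    index += 1;
  }
  return captures;
}

// Assumed rule, for illustration only: keep punctuation on the first capture.
const result = extractCaptures(
  'First capture.Second capture, $THIRD_CAPTURE{} fourth capture.',
  (ending, index) => index === 0
);
console.log(result);
```

With this rule the first element keeps its dot while the others lose their punctuation; passing a function that always returns false gives the same result as the original loop.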