I need some regex help. I'm looking for a way in Java to split input text into words while also keeping the delimiters (whitespace, punctuation). Put another way, each word should get its own array element, and the non-word characters between words should end up in elements of their own.
This input text:
"Hello, this isn't working!"
should be put into an array like this:
{"Hello", ",", "this", "isn't", "working", "!"}
or
{"Hello", ", ", "this", " ", "isn't", " ", "working", "!"}
I've done basically the same thing in Python using this:
import re

def split_input(string):
    return re.findall(r"[\w']+|[\s.,!?;:-]", string)
But I've yet to find a way to accomplish the same thing in Java. I've tried String.split() with lookahead/lookbehind, and I've tried using Pattern and Matcher, but haven't had much luck.
Any help would be much appreciated!
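For reference, here is a rough sketch of how the Python one-liner might be ported. Java has no direct equivalent of re.findall, but looping over Matcher.find() and collecting Matcher.group() should have the same effect. The class name SplitInput and the method name splitInput are just illustrative, and the pattern is copied from the Python version with the backslashes doubled for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitInput {
    // Same alternation as the Python pattern: either a run of word
    // characters and apostrophes, or one whitespace/punctuation character.
    private static final Pattern TOKEN = Pattern.compile("[\\w']+|[\\s.,!?;:-]");

    // Collects every non-overlapping match in order, like re.findall.
    static List<String> splitInput(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Print each token in brackets so whitespace tokens are visible.
        for (String token : splitInput("Hello, this isn't working!")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

For the sample input this yields nine tokens: "Hello", ",", " ", "this", " ", "isn't", " ", "working", "!". Each whitespace character becomes its own token, so to get the first form shown above, the whitespace-only tokens would need to be skipped (for example by checking token.trim().isEmpty() before adding). Note also that \w in Java matches only ASCII word characters unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS, whereas Python 3 treats \w as Unicode-aware for str patterns.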