So I'm completely new to regular expressions, and I'm trying to use Java's java.util.regex to find punctuation in input strings. I won't know ahead of time what kind of punctuation I might get, except that (1) !, ?, ., and ... are all valid punctuation, and (2) "<" and ">" mean something special and don't count as punctuation.
The program itself builds phrases pseudo-randomly, and I want to strip off the punctuation at the end of a sentence before it goes through the random process.
I can match entire words with any punctuation, but the matcher just gives me indexes for that word. In other words:
Pattern p = Pattern.compile("(.*\\!)*?");
Matcher m = p.matcher([some input string]);
will grab any words with a "!" on the end. For example:
String inputString = "It is a warm Summer day!";
Pattern p = Pattern.compile("(.*\\!)*?");
Matcher m = p.matcher(inputString);
String match = inputString.substring(m.start(), m.end());
results in --> String match ~ "day!"
But I want to have Matcher index just the "!", so I can just split it off.
I could probably write a separate case for each kind of punctuation I might get and use String.substring(...) on each, but I'm hoping I've just made a mistake in my regular expression and that there's a cleaner way to do this.
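To make the goal concrete, here is a minimal sketch of the behavior I'm after, not a definitive solution. It assumes the punctuation always sits at the very end of the input and consists only of !, ? and . characters; the class name TrailingPunctuation and the helper splitTrailingPunctuation are hypothetical names made up for illustration. The idea is to anchor a character class to the end of the string so that the matcher indexes only the punctuation rather than the whole word.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingPunctuation {
    // One or more of '!', '?', '.' at the end of the input.
    // '<' and '>' are not in the character class, so they never count as punctuation.
    private static final Pattern TRAILING = Pattern.compile("[!?.]+$");

    // Hypothetical helper: returns { text without trailing punctuation, the punctuation }.
    public static String[] splitTrailingPunctuation(String input) {
        Matcher m = TRAILING.matcher(input);
        // find() must succeed before start(), end() or group() may be called.
        if (m.find()) {
            return new String[] { input.substring(0, m.start()), m.group() };
        }
        // No trailing punctuation: return the input unchanged with an empty suffix.
        return new String[] { input, "" };
    }

    public static void main(String[] args) {
        String[] parts = splitTrailingPunctuation("It is a warm Summer day!");
        System.out.println(parts[0]); // It is a warm Summer day
        System.out.println(parts[1]); // !
    }
}
```

With this pattern, m.start() points at the first trailing punctuation character, so "..." comes off as a single unit and a string ending in ">" is left alone.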