You can't use split that easily - split will eliminate the separators and give you only the things in between. As you need the separators, no can do.
One real dirty trick is to use something called 'lookahead'. That argument you pass to split is a regular expression. Most 'characters' in a regexp have the property that they consume the matching input. If you do input.split("\\s+")
then that doesn't 'just' split on whitespace, it also consumes them: The whitespace is no longer part of the individual entries in your string array.
However, consider ^
and $
. or \\b
. These still match things but don't consume anything. You don't consume 'end of string'. In fact, ^^^hello$$$
matches the string "hello"
just as well. You can do this yourself, using lookahead: It matches when the lookahead is there but does not consume it:
String[] args = "Hello World$Huh Weird".split("(?=[\\s_$-]+)");
for (String arg : args) System.out.println("*" + args[i] + "*");
Unfortunately, this 'works', in that it saves your separators, but isn't getting you all that much closer to a solution:
*Hello*
* World*
*$Huh*
* *
* *
* Weird*
You can go with lookbehind as well, but it's limited; they don't do variable length, for example.
The conclusion should rapidly become: Actually, doing this with split
is a mistake.
Then, once split is off the table, you should no longer use streams, either: Streams don't do well once you need to know stuff about the previous element in a stream to do the job: A stream of characters doesn't work, as you need to know if the previous character was a non-letter or not.
In general, "I want to do X, and use Y" is a mistake. Keep an open mind. It's akin to asking: "I want to butter my toast, and use a hammer to do it". Oookaaaaayyyy, you can probably do that, but, eh, why? There are butter knives right there in the drawer, just.. put down the hammer, that's toast. Not a nail.
Same here.
A simple loop can take care of this, no problem:
private static final String BREAK_CHARS = "&-_`";
public String toTitleCase(String input) {
StringBuilder out = new StringBuilder();
boolean atBreak = true;
for (char c : input.toCharArray()) {
out.append(atBreak ? Character.toUpperCase(c) : c);
atBreak = Character.isWhitespace(c) || (BREAK_CHARS.indexOf(c) > -1);
}
return out.toString();
}
Simple. Efficient. Easy to read. Easy to modify. For example, if you want to go with 'any non-letter counts', trivial: atBreak = Character.isLetter(c);
.
Contrast to the stream solution which is fragile, weird, far less efficient, and requires a regexp that needs half a page's worth of comment for anybody to understand it.
Can you do this with streams? Yes. You can butter toast with a hammer, too. Doesn't make it a good idea though. Put down the hammer!