If we only had to count special characters and vowels, we could use something like this:
    Map<String,Long> result;
    try(Stream<String> lines = Files.lines(path)) {
        result = lines
            .flatMap(Pattern.compile("\\s+")::splitAsStream)
            .flatMapToInt(String::chars)
            .filter(c -> !Character.isAlphabetic(c) || "aeiou".indexOf(c) >= 0)
            .mapToObj(c -> "aeiou".indexOf(c)>=0? "totalVowelCount": "totalSpecialCharacter")
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }
First we flatten the stream of lines to a stream of words, then to a stream of characters, to group them by their type. This works smoothly as “special character” and “vowel” are mutually exclusive. In principle, the flattening to words could have been omitted if we just extended the filter to skip white-space characters, but here it helps in getting to a solution that also counts words.
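To illustrate the remark about omitting the word flattening, here is a minimal sketch that goes straight from lines to characters and skips white-space in the filter. The class name `CharCategoryCount` and the in-memory input are only for illustration, and it uses `Character.isWhitespace`, which is an assumption that does not match the regex class `\s` exactly for every Unicode character.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original solution.
class CharCategoryCount {
    // Counts vowels and special characters without splitting lines into words first.
    static Map<String, Long> count(Stream<String> lines) {
        return lines
            .flatMapToInt(String::chars)
            // keep vowels and non-alphabetic characters, but drop white-space
            .filter(c -> !Character.isWhitespace(c)
                      && (!Character.isAlphabetic(c) || "aeiou".indexOf(c) >= 0))
            .mapToObj(c -> "aeiou".indexOf(c) >= 0 ? "totalVowelCount" : "totalSpecialCharacter")
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // In real use, the stream would come from Files.lines(path).
        System.out.println(count(Stream.of("hello, world!", "a b")));
    }
}
```

This variant cannot be extended to count words, which is why the solutions here keep the split into words.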
Since words are a different kind of entity than characters, counting them in the same operation is not that straightforward. One solution is to inject a pseudo character for each word and count it just like the other characters at the end. Since all actual characters are non-negative, we can use -1 for that:
    Map<String,Long> result;
    try(Stream<String> lines = Files.lines(path)) {
        result = lines.flatMap(Pattern.compile("\\s+")::splitAsStream)
            .filter(w -> !w.isEmpty()) // skip empty strings, e.g. from lines with leading white-space
            .flatMapToInt(w -> IntStream.concat(IntStream.of(-1), w.chars()))
            .mapToObj(c -> c==-1? "totalWordCount": "aeiou".indexOf(c)>=0? "totalVowelCount":
                Character.isAlphabetic(c)? "totalAlphabetic": "totalSpecialCharacter")
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }
This adds a "totalAlphabetic" category in addition to the others into the result map. If you do not want that, you can insert a .filter(cat -> !cat.equals("totalAlphabetic")) step between the mapToObj and collect steps. Or use a filter like in the first solution before the mapToObj step.
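As a self-contained sketch of the variant with the extra filter step, the following works on an in-memory stream of lines instead of a file. The class name `WordVowelSpecialCount` is made up for the example, and the `isEmpty` filter is an added assumption so that lines with leading white-space do not produce an empty "word".

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper class for illustration.
class WordVowelSpecialCount {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static Map<String, Long> count(Stream<String> lines) {
        return lines
            .flatMap(WHITESPACE::splitAsStream)
            .filter(w -> !w.isEmpty()) // assumption: empty strings are not words
            // -1 is the pseudo character marking the start of a word
            .flatMapToInt(w -> IntStream.concat(IntStream.of(-1), w.chars()))
            .mapToObj(c -> c == -1 ? "totalWordCount"
                         : "aeiou".indexOf(c) >= 0 ? "totalVowelCount"
                         : Character.isAlphabetic(c) ? "totalAlphabetic"
                         : "totalSpecialCharacter")
            // drop the category we are not interested in
            .filter(cat -> !cat.equals("totalAlphabetic"))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count(Stream.of("hello, world!", "  a b")));
    }
}
```

For the two sample lines, the map should contain four words, four vowels and two special characters, and no "totalAlphabetic" entry.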
As an additional note, this solution does more work than necessary, because it first splits the input into lines. We can treat line breaks just like other white-space, i.e. as a word boundary. Starting with Java 9, we can use Scanner for the job:
    Map<String,Long> result;
    try(Scanner scanner = new Scanner(path)) {
        result = scanner.findAll("\\S+")
            .flatMapToInt(w -> IntStream.concat(IntStream.of(-1), w.group().chars()))
            .mapToObj(c -> c==-1? "totalWordCount": "aeiou".indexOf(c)>=0? "totalVowelCount":
                Character.isAlphabetic(c)? "totalAlphabetic": "totalSpecialCharacter")
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }
This will split the input into words in the first place without treating line breaks specially. Another answer contains a Java 8 compatible implementation of Scanner.findAll.
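If you are on Java 8, one possible way to get a similar stream of matches is to wrap Scanner.findWithinHorizon and Scanner.match in a Spliterator. The following is only a rough sketch under that assumption; the class name `Java8FindAll` is made up, and it is not necessarily identical to the implementation referred to above.

```java
import java.util.Scanner;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class, a stand-in for Java 9's Scanner.findAll.
class Java8FindAll {
    static Stream<MatchResult> findAll(Scanner scanner, Pattern pattern) {
        return StreamSupport.stream(new Spliterators.AbstractSpliterator<MatchResult>(
                Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(Consumer<? super MatchResult> action) {
                // horizon 0 means: search the remaining input without a bound
                if (scanner.findWithinHorizon(pattern, 0) == null) {
                    return false; // no further match
                }
                action.accept(scanner.match()); // result of the search just performed
                return true;
            }
        }, false);
    }

    public static void main(String[] args) {
        try (Scanner scanner = new Scanner("foo  bar\nbaz")) {
            findAll(scanner, Pattern.compile("\\S+"))
                .map(MatchResult::group)
                .forEach(System.out::println);
        }
    }
}
```

With this helper, the scanner.findAll("\\S+") call becomes findAll(scanner, Pattern.compile("\\S+")), and the rest of the pipeline stays the same.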
The solutions above treat every character that is neither white-space nor alphabetic as a “special character”. If your definition of “special character” is different, it should not be too hard to adapt the solutions.
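One way to adapt the definition is to pass it in as a predicate. The sketch below assumes a hypothetical class `CustomSpecialCount` and an `isSpecial` parameter, neither of which is part of the solutions above. Characters that are neither vowels nor special end up in an "other" category that is filtered out.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.IntPredicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class for illustration.
class CustomSpecialCount {
    static Map<String, Long> count(String text, IntPredicate isSpecial) {
        return Pattern.compile("\\s+").splitAsStream(text)
            .filter(w -> !w.isEmpty()) // assumption: empty strings are not words
            .flatMapToInt(w -> IntStream.concat(IntStream.of(-1), w.chars()))
            .mapToObj(c -> c == -1 ? "totalWordCount"
                         : "aeiou".indexOf(c) >= 0 ? "totalVowelCount"
                         : isSpecial.test(c) ? "totalSpecialCharacter"
                         : "other")
            .filter(cat -> !cat.equals("other"))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Example definition: digits do not count as special characters.
        System.out.println(count("abc 123 !?", c -> !Character.isLetterOrDigit(c)));
    }
}
```

With the example definition, "abc 123 !?" should give three words, one vowel and two special characters.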