2

I have to find all longest words from given file using Streams API. I did it in few steps but looking for some "one liner", actually I processing whole file two times, first to find max length of word and second for comparing all to max length, assuming its not the best for performance ; P Could someone help me? Just look at code:

public class Test {
    public static void main(String[] args) throws IOException {
        List<String> words = Files.readAllLines(Paths.get("alice.txt"));
        OptionalInt longestWordLength = words.stream().mapToInt(String::length).max();
        Map<Integer, List<String>> groupedByLength = words.stream().collect(Collectors.groupingBy(String::length));
        List<String> result = groupedByLength.get(longestWordLength.getAsInt());
    }
}

I wish to make it straight:

List<String> words = Files.readAllLines(Paths.get("alice.txt"));
List<String> result = // code

File contains just one word per line, anyway it's not important - question is about the right stream code.

Mureinik
  • 297,002
  • 52
  • 306
  • 350
minizibi
  • 653
  • 3
  • 13
  • 28
  • 1
    Does it have to be with streams? – Oleg Sep 21 '17 at 10:36
  • Iterating two times is not always the worst thing to do. What’s more offending is this entirely unnecessary `groupingBy`, just to get the one group afterward, instead of a straight-forward `int longest = longestWordLength.getAsInt(); List result = list.stream() .filter(s -> s.length() == longest) .collect(toList());` – Holger Sep 21 '17 at 15:15

3 Answers3

7

Instead of just keeping the largest length, you could collect the words into a map from their length to the words, and then just take the longest one:

List<String> longestWords =
    Files.lines(Paths.get("alice.txt"))
         .collect(Collectors.groupingBy(String::length))
         .entrySet()
         .stream()
         .sorted(Map.Entry.<Integer, List<String>> comparingByKey().reversed())
         .map(Map.Entry::getValue)
         .findFirst()
         .orElse(null);

EDIT:
As Malte Hartwig noted, using max on the streamed map is much more elegant (and probably faster):

List<String> longestWords =
    Files.lines(Paths.get("alice.txt"))
         .collect(Collectors.groupingBy(String::length))
         .entrySet()
         .stream()
         .max(Map.Entry.comparingByKey())
         .map(Map.Entry::getValue)
         .orElse(null);

EDIT2:
There's a built-in inefficiency in both the above solutions, as they both build a map a essentially store the lengths for all the strings in the file instead of just the longest ones. If performance is more important than elegance in your usecase, you could write your own Collector to just preserve the longest strings in list:

private static int stringInListLength(List<String> list) {
    return list.stream().map(String::length).findFirst().orElse(0);
}

List<String> longestWords =
    Files.lines(Paths.get("alice.txt"))
         .collect(Collector.of(
             LinkedList::new,
             (List<String> list, String string) -> {
                 int stringLen = string.length();
                 int listStringLen = stringInListLength(list);
                 if (stringLen > listStringLen) {
                     list.clear();
                 }
                 if (stringLen >= listStringLen) {
                     list.add(string);
                 }
             },
             (list1, list2) -> {
                 int list1StringLen = stringInListLength(list1);
                 int list2StringLen = stringInListLength(list2);
                 if (list1StringLen > list2StringLen) {
                     return list1;
                 }
                 if (list2StringLen > list1StringLen) {
                     return list2;
                 }
                 list1.addAll(list2);
                 return list1;
             }
         ));
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • maan you're amazing, i was so close with my own solution. thanks! – minizibi Sep 21 '17 at 10:45
  • By adding the `sorted` operation you're introducing a bit of overhead (`nlogn` operations for sorting at best) that can be reduced to `n` operations with a `reduce` – Alberto S. Sep 21 '17 at 10:53
  • 2
    @minizibi You can max it a bit easier to read (imo) by using `Stream.max(Comparator)` instead of `sorted().findFirst()`: `...entrySet().stream().max(Map.Entry.comparingByKey()).map(Map.Entry::getValue).orElse(null)`. This saves a line, is more straightforward (no reversed call e.g.) and faster (I imagine max to be faster than sort, as it doesn't need as many comparisons). – Malte Hartwig Sep 21 '17 at 11:30
  • @MalteHartwig d`oh! Don't know what I was thinking there. Thanks! – Mureinik Sep 21 '17 at 11:34
  • 3
    That `stringInListLength` could be implemented as `list.isEmpty()? 0: list.get(0).length()`… Further, [this](https://stackoverflow.com/a/29339106/2711488) might be a good read. – Holger Sep 21 '17 at 15:11
3

reduce will help you:

    Optional<String> longest = words.stream()
            .reduce((s1, s2) -> {
                if (s1.length() > s2.length())
                    return s1;
                else
                    return s2;
            });

In case the Stream is empty it will return an Optional.empty

In case you want the list of all the words that have the maximum length this piece will help you:

    Optional<List<String>> longest = words.stream()
            .collect(Collectors.groupingBy(
                    String::length,
                    Collectors.toList()
            ))
            .entrySet()
            .stream()
            .reduce(
                    (entry1, entry2) -> {
                        if (entry1.getKey() > entry2.getKey())
                            return entry1;
                        else
                            return entry2;
                    }
            )
            .map(Map.Entry::getValue);
Alberto S.
  • 7,409
  • 6
  • 27
  • 46
0

iterate over the map keys to find the longest wordlength

Linusk
  • 71
  • 4