2

I wrote this method:

public static void main(String... args) {
    try (var linesStream = Files.lines(Paths.get("C:\\Users\\paul\\Desktop\\java.txt"))) {
        Stream<String> words = linesStream.
                flatMap(line -> Arrays.stream(line.split(" ")))
                .distinct();
        System.out.println("There are " + words.count() + " distinct words in this file, here they are:");
        words.forEach(System.out::println);
    } catch (IOException e) {
        System.err.println(e.getMessage());
    }
}

The problem I have here is that I operate on the words Stream<String> twice. To do that, do I have to explicitly rebuild the stream, or is there some magic reset method I could use?

Also, in order to rebuild the words stream, I have to rebuild the linesStream and wrap that in another try/catch block... Very verbose. Is there a way to make this kind of thing easier to write?

I guess I could do:

    static Stream<String> getStreamFromFile() throws IOException {
        return Files.lines(Paths.get("C:\\Users\\paul\\Desktop\\java.txt"));
    }

    static Stream<String> getDistinctWords(Stream<String> lines) {
        return lines
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .distinct();
    }

    public static void main(String... args) {
        Stream<String> lines1 = null;
        Stream<String> lines2 = null;
        try {
            lines1 = getStreamFromFile();
            lines2 = getStreamFromFile();
            Stream<String> distinctWords1 = getDistinctWords(lines1);
            Stream<String> distinctWords2 = getDistinctWords(lines2);
            System.out.println("There are " + distinctWords1.count() + " distinct words in this file, here they are:");
            distinctWords2.forEach(System.out::println);
        } catch (IOException e) {
            System.err.println(e.getMessage());
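        } finally {
            // Guard against null: getStreamFromFile() may have thrown before assignment
            if (lines1 != null) lines1.close();
            if (lines2 != null) lines2.close();
        }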
    }

but is this all I am left with?

Coder-Man
  • You might also want to read the answers to https://stackoverflow.com/q/28459498/6395627 – Slaw May 27 '18 at 18:23

3 Answers

3

You can't re-use streams: a stream can be consumed by only one terminal operation. Instead, collect the elements into a collection, e.g. a List, or call a (stateful) function which outputs each element and also increments a count.
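
To illustrate why re-use fails, here is a minimal sketch. The class name, method name, and sample words are made up for this example. It runs one terminal operation on a stream and then tries a second one on the same stream, which the Stream API rejects with an IllegalStateException.

```java
import java.util.stream.Stream;

public class StreamReuseDemo {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> words = Stream.of("a", "b", "a").distinct();
        words.count();                 // first terminal operation consumes the stream
        try {
            words.forEach(w -> { });   // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;               // the stream was already operated upon
        }
    }

    public static void main(String... args) {
        System.out.println("Second use rejected: " + secondUseFails());
    }
}
```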

jon hanson
3

You can't reset a Stream, but you can collect the results of your distinct() into a List. You can also split on the regex \\s+ so that a run of whitespace counts as a single separator. Like,

static List<String> getDistinctWords(Stream<String> lines) {
    return lines.flatMap(line -> Arrays.stream(line.split("\\s+"))).distinct()
            .collect(Collectors.toList());
}

And then change your caller like

List<String> distinctWords = getDistinctWords(lines);
System.out.println("There are " + distinctWords.size() 
        + " distinct words in this file, here they are:");
distinctWords.forEach(System.out::println);

Also, you shouldn't hard-code paths like that. You can use the user.home system property to locate your file. Like,

return Files.lines(Paths.get(System.getProperty("user.home"), "Desktop/java.txt"));
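
Putting these pieces together, a complete program might look like the sketch below. The class name DistinctWords and the helper distinctWords(Path) are hypothetical names chosen for this example, and the filter that drops empty tokens is an extra assumption (split("\\s+") yields an empty string for blank lines and for lines with leading whitespace). The file is opened once in a try-with-resources block, and the resulting List can be traversed as often as needed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DistinctWords {

    // Reads the file once, closes it, and returns the distinct words as a reusable List.
    static List<String> distinctWords(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.flatMap(line -> Arrays.stream(line.split("\\s+")))
                    .filter(word -> !word.isEmpty()) // assumption: empty tokens are not words
                    .distinct()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String... args) {
        Path file = Paths.get(System.getProperty("user.home"), "Desktop/java.txt");
        try {
            List<String> words = distinctWords(file);
            // The List can be used twice: once for its size, once for its elements.
            System.out.println("There are " + words.size()
                    + " distinct words in this file, here they are:");
            words.forEach(System.out::println);
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Only one try block is needed here, because the stream never leaves the helper method.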
Elliott Frisch
2

The real problem is that streams do not support invoking more than one terminal operation on them, which is an unfortunate limitation.

The closest alternative is to collect your processed data into a collection and run the same operations:

List<String> distinctWords = getDistinctWords(lines1)
              .collect(Collectors.toList());

System.out.println("There are " + distinctWords.size() + 
        " distinct words in this file, here they are:");
distinctWords.forEach(System.out::println);

Another approach would be to use stateful behavior, where operations performed during the stream traversal have side-effects:

AtomicLong al = new AtomicLong();
getDistinctWords(lines1).forEach(string -> {
    al.incrementAndGet();
    System.out.println(string);
});

System.out.println("There are " + al.get() + 
        " distinct words in this file, here they are:");

Stateful behavior in streams should be used with caution. The documentation of the java.util.stream package has a lot of information about this. But I believe that in this case the side effects are not a problem.

ernest_k
  • 2
    This limitation is anything but unfortunate; it's a very well thought-out design decision. The comment under the question has the link for it – Eugene May 27 '18 at 19:06
  • @Eugene In that sense it can also be argued that it's not a limitation at all (understandably so). I understand that it was a choice made for many other features of the API. It's only unfortunate as far as we want to but can't run multiple terminal operations, I don't mean it was a bad design decision. – ernest_k May 27 '18 at 19:13
  • 1
    Point understood; I thought you were referring to it differently. You "can" have multiple terminal operations, sort of, via a custom collector, but not out of the box – Eugene May 27 '18 at 19:15
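
The collector idea from the last comment can be sketched as follows. This is only one possible reading of it: it assumes Java 12 or later, where Collectors.teeing feeds each element to two downstream collectors in a single pass. The class TeeingDemo and the holder CountAndWords are hypothetical names invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TeeingDemo {

    // Hypothetical holder for the two results; not part of the JDK.
    static final class CountAndWords {
        final long count;
        final List<String> words;

        CountAndWords(long count, List<String> words) {
            this.count = count;
            this.words = words;
        }
    }

    // One traversal produces both the count and the list of distinct words.
    static CountAndWords countAndCollect(Stream<String> words) {
        return words.distinct().collect(
                Collectors.teeing(
                        Collectors.counting(),   // first result: number of elements
                        Collectors.toList(),     // second result: the elements themselves
                        CountAndWords::new));    // merge both results into the holder
    }

    public static void main(String... args) {
        CountAndWords result = countAndCollect(Stream.of("to", "be", "or", "not", "to", "be"));
        System.out.println("There are " + result.count
                + " distinct words, here they are:");
        result.words.forEach(System.out::println);
    }
}
```

For this particular question a plain List is simpler, since its size() already gives the count; the collector approach is more useful when the two results are not derivable from each other.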