
I am writing a parser for a file in Java 8. The file is read using Files.lines, which returns a sequential Stream<String>.

Each line is mapped to a data object Result like this:

Result parse(String _line) {
  // ... code here
  Result _result = new Result();
  if (/* line is not needed */) {
    return null;
  } else {
    /* parse line into result */
    return _result;
  }
}

Now we can map each line in the stream to its corresponding result:

public Stream<Result> parseFile(Path _file) throws IOException {
  Stream<String> _stream = Files.lines(_file); // Files.lines declares IOException
  return _stream.map(this::parse); // may contain null for lines that are not needed
}

However, the stream now contains null values, which I want to remove:

parseFile(_file).filter(v -> v != null);

How can I combine the map and filter operations into one, given that parse (called from _stream.map) already knows whether the result is needed?

Captain Man
user3001
  • I don't get it, what's wrong with `return Files.lines(_file).map(this::parse).filter(v -> v != null);`? – user2336315 Nov 06 '14 at 12:44
  • Well, I assume the stream has to be processed twice, once for map and once for filter. I want to discard the unnecessary elements within the map operation; that should be faster in any case. – user3001 Nov 06 '14 at 12:56
  • The stream will be processed in one run and only if you use a terminal operation that requires full iteration (e.g. forEach, collect, reduce). – Lukasz Wiktor Nov 06 '14 at 13:13
  • @user3001 See http://stackoverflow.com/questions/23696317/java-8-find-first-element-by-predicate/23696571 – user2336315 Nov 06 '14 at 13:36
  • See also: http://stackoverflow.com/questions/21219667/stream-and-lazy-evaluation – assylias Nov 06 '14 at 13:43
  • Your assumption about multiple passes is incorrect. Filtering and mapping are processed in a single pass. (In general, the entire pipeline is processed in one pass, unless there are operations like sorting that must see all the data before yielding any data.) – Brian Goetz Nov 06 '14 at 14:14
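
To make the single-pass behaviour described in the comments visible, here is a minimal sketch that logs each call inside map and filter. The class name SinglePassDemo, the sample lines, and the "lines starting with # are not needed" rule are all assumptions made for illustration; they are not from the question. It assumes a sequential stream, as in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the original question.
class SinglePassDemo {

    // Records the order in which map and filter touch each element.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<String> kept = Stream.of("a", "#skip", "b")
                .map(line -> {
                    log.add("map " + line);
                    // Stand-in for parse(): null marks a line that is not needed.
                    return line.startsWith("#") ? null : line.toUpperCase();
                })
                .filter(v -> {
                    log.add("filter " + v);
                    return v != null;
                })
                .collect(Collectors.toList());
        log.add("result " + kept);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

In the log, each "map" entry is immediately followed by the "filter" entry for the same element, so every line travels through the whole pipeline before the next line is read. There is no separate second pass for the filter.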

2 Answers


As already pointed out in the comments, the stream will be processed in one pass, so there isn't really a need to change anything. For what it's worth, you could use flatMap and let parse return a stream:

Stream<Result> parse(String _line) {
  // ... code here
  Result _result = new Result();
  if (/* line is not needed */) {
    return Stream.empty();
  } else {
    /* parse line into result */
    return Stream.of(_result);
  }
}

public Stream<Result> parseFile(Path _file) throws IOException {
  // flatMap drops the empty streams and unwraps the single-element ones
  return Files.lines(_file)
              .flatMap(this::parse);
}

That way you won't have any null values in the first place.
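
For anyone who wants to try this without a file, here is a small self-contained sketch of the same flatMap idea. The Result class, the FlatMapParseDemo name, and the rule that blank lines and '#' comments are not needed are all made up for illustration. The lines come from an in-memory stream instead of Files.lines.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the original answer.
class FlatMapParseDemo {

    // Hypothetical stand-in for the question's Result type.
    static class Result {
        final String value;
        Result(String value) { this.value = value; }
    }

    // Assumed rule for illustration: blank lines and '#' comments are not needed.
    static Stream<Result> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Stream.empty();          // contributes nothing to the output
        }
        return Stream.of(new Result(trimmed)); // contributes exactly one element
    }

    // Parses every line and returns the values of the results that were kept.
    static List<String> parseAll(Stream<String> lines) {
        return lines.flatMap(FlatMapParseDemo::parse)
                .map(r -> r.value)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [alpha, beta]
        System.out.println(parseAll(Stream.of("alpha", "# comment", "", " beta ")));
    }
}
```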

a better oliver

Updating for Java 9:

Using Stream<Result> seems like the wrong return type for the parse() function. A stream can contain many, many values, so the user of parse() either has to assume there will be at most one value in the stream, or use something like collect to extract and use the results of the parse() operation. If the function and its usage are only separated by a few lines of code, this may be fine, but if the distance increases, such as in a completely different file for JUnit testing, the interface contract isn't clear from the return value.

Instead of returning a Stream, it would be a better interface contract to return an empty Optional when the line is not needed.

Optional<Result> parse(String _line) {
   // ... code here
   Result _result = null;
   if (/* line needed */) {
      /* parse line into result, assigning a non-null _result */
   }
   return Optional.ofNullable(_result);
}

Unfortunately, now _stream.map(this::parse) returns a stream of Optional values, so with Java 8, again you'd need to filter and map this with .filter(Optional::isPresent).map(Optional::get), and the question was looking for a solution which could do this "in one go".
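
A minimal sketch of that Java 8 filter-then-map step, runnable on its own. The OptionalJava8Demo name, the use of String in place of Result, and the '#' rule are assumptions for illustration only.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; String stands in for the question's Result type.
class OptionalJava8Demo {

    // Assumed rule for illustration: lines starting with '#' are not needed.
    static Optional<String> parse(String line) {
        if (line.startsWith("#")) {
            return Optional.empty();
        }
        return Optional.of(line.trim());
    }

    static List<String> parseAll(Stream<String> lines) {
        return lines.map(OptionalJava8Demo::parse)
                .filter(Optional::isPresent) // drop the empty Optionals
                .map(Optional::get)          // safe here: only present values remain
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [one, two]
        System.out.println(parseAll(Stream.of("one", "#ignored", "two ")));
    }
}
```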

This question was posted 3 years ago. With Java 9, we now have the option (pun intended) of using the Optional::stream method, so we can instead write:

public Stream<Result> parseFile(Path _file) throws IOException {
  return Files.lines(_file)
      .map(this::parse)
      .flatMap(Optional::stream); // Java 9+: empty Optional becomes an empty stream
}

to transform the stream of Optional values into a stream of Result values, without any of the empty optionals.
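
Here is a self-contained sketch of the Optional::stream version that runs without a file (Java 9 or later). The OptionalStreamDemo name, the String stand-in for Result, and the '#' rule are hypothetical and only there to make the example runnable.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; requires Java 9+ for Optional.stream().
class OptionalStreamDemo {

    // Assumed rule for illustration: lines starting with '#' are not needed.
    static Optional<String> parse(String line) {
        if (line.startsWith("#")) {
            return Optional.empty();
        }
        return Optional.of(line.trim());
    }

    static List<String> parseAll(Stream<String> lines) {
        return lines.map(OptionalStreamDemo::parse)
                // Each Optional becomes a stream of zero or one element.
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [one, two]
        System.out.println(parseAll(Stream.of("one", "#ignored", "two ")));
    }
}
```

The caller of parse() still sees a clear "zero or one result" contract, and the pipeline needs no explicit isPresent/get pair.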

AJNeufeld
  • In that case you're replacing the `map&filter` with a `map&flatMap`. There's not much difference from using `null` directly and then filtering it out. – Alowaniak May 20 '21 at 11:06