9

Is there an equivalent to getLineNumber() for Streams in Java 8?

I want to search for a word in a text file and return the line number as an Integer. This is my search method:

result = Files.lines(Paths.get(fileName))
            .filter(w -> w.contains(word))
            .collect(Collectors.<String> toList());
wundidajah

3 Answers

9

I don't think there is, because streams, unlike collections, are not designed to provide indexed access to their elements.

One workaround would be to read the file into a list, then use an IntStream to generate the corresponding indices, to which you can then apply your filter:

List<String> list =  Files.readAllLines(Paths.get("file"));

// The current readAllLines implementation returns an ArrayList, which
// supports fast random access, so calling get(i) is cheap.
// The pipeline can safely be run in parallel.
List<Integer> lineNumbers = 
     IntStream.range(0, list.size())
              .filter(i -> list.get(i).contains(word))
              .mapToObj(i -> i + 1)
              .collect(toList());

It's a bit overkill, since you load the entire file's content into a list and may end up keeping only a few elements. If that doesn't satisfy you, you can write a good old for loop; it's not much code.
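The plain loop mentioned above could look roughly like this. It is only a minimal sketch: the class name `LoopLineSearch` and the method `matchingLineNumbers` are made up for illustration, and the method takes a `BufferedReader` so that the caller decides where the text comes from.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class LoopLineSearch {

    // Returns the 1-based numbers of all lines that contain the given word.
    static List<Integer> matchingLineNumbers(BufferedReader reader, String word)
            throws IOException {
        List<Integer> lineNumbers = new ArrayList<>();
        int lineNumber = 0;
        String line;
        // Read one line at a time, so the whole file is never held in memory.
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            if (line.contains(word)) {
                lineNumbers.add(lineNumber);
            }
        }
        return lineNumbers;
    }
}
```

You would typically call it inside a try-with-resources block around `Files.newBufferedReader(Paths.get("file"))` so that the reader gets closed.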

You may also be interested in the question "Zipping streams using JDK8 with lambda (java.util.stream.Streams.zip)". For example, using the proton-pack library:

List<Long> lineNumbers = 
    StreamUtils.zipWithIndex(Files.lines(Paths.get("file")))
               .filter(in -> in.getValue().contains(word))
               .map(in -> in.getIndex() + 1)
               .collect(toList());

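Or you can create a LineNumberReader from a BufferedReader, then call lines() and map each line to its line number in the file. Note that this approach gives wrong results if the pipeline is run in parallel, because getLineNumber() reports the reader's current position rather than the position of the line being processed, so I don't recommend it.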

LineNumberReader numberRdr = new LineNumberReader(Files.newBufferedReader(Paths.get("file")));

List<Integer> linesNumbers = numberRdr.lines()
                                      .filter(w -> w.contains(word))
                                      .map(w -> numberRdr.getLineNumber())
                                      .collect(toList());
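The snippet above never closes the reader. A minimal sequential-only sketch with try-with-resources might look like the following; the class name `NumberedReaderSearch` and the method `matchingLineNumbers` are hypothetical, and the caveat about parallel pipelines still applies.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class NumberedReaderSearch {

    // Returns the 1-based numbers of all lines that contain the given word.
    // The stream must stay sequential: getLineNumber() reads the reader's
    // current state, which only matches the line in flight when lines are
    // consumed one at a time, in order.
    static List<Integer> matchingLineNumbers(Reader source, String word)
            throws IOException {
        try (LineNumberReader reader = new LineNumberReader(source)) {
            return reader.lines()
                         .filter(line -> line.contains(word))
                         .map(line -> reader.getLineNumber())
                         .collect(Collectors.toList());
        }
    }
}
```

For a file, the `source` argument could be `Files.newBufferedReader(Paths.get("file"))`.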
Alexis C.
6

If you want to keep the efficient, lazy nature of Streams (i.e. not read an entire file if you only want to find the first match), you'll have to construct the stream yourself. This isn't too hard; the only obstacle is the absence of a tuple type to carry both a line number and a line String. You can either abuse Map.Entry instances or create a dedicated type:

static final class NumberedLine {
    final int number;
    final String line;
    NumberedLine(int number, String line) {
        this.number = number;
        this.line = line;
    }
    public int getNumber() {
        return number;
    }
    public String getLine() {
        return line;
    }
    @Override
    public String toString() {
        return number+":\t"+line;
    }
}

Then you can implement the stream in a straightforward way:

public static Stream<NumberedLine> lines(Path p) throws IOException {
    BufferedReader b=Files.newBufferedReader(p);
    Spliterator<NumberedLine> sp=new Spliterators.AbstractSpliterator<NumberedLine>(
        Long.MAX_VALUE, Spliterator.ORDERED|Spliterator.NONNULL) {
            int line;
            public boolean tryAdvance(Consumer<? super NumberedLine> action) {
                String s;
                try { s=b.readLine(); }
                catch(IOException e){ throw new UncheckedIOException(e); }
                if(s==null) return false;
                action.accept(new NumberedLine(++line, s));
                return true;
            }
        };
    return StreamSupport.stream(sp, false).onClose(()->{
        try { b.close(); } catch(IOException e){ throw new UncheckedIOException(e); }});
}

Using this method, you can search for the first occurrence:

OptionalInt lNo=lines(path).filter(nl->nl.getLine().contains(word))
                           .mapToInt(NumberedLine::getNumber)
                           .findFirst();

or collect all of them:

List<Integer> all=lines(path).filter(nl->nl.getLine().contains(word))
                             .map(NumberedLine::getNumber)
                             .collect(Collectors.toList());

In production code, however, you should ensure that the underlying resources are closed properly:

OptionalInt lNo;
try(Stream<NumberedLine> s=lines(path)) {
    lNo=s.filter(nl->nl.getLine().contains(word))
         .mapToInt(NumberedLine::getNumber)
         .findFirst();
}

and, respectively:

List<Integer> all;
try(Stream<NumberedLine> s = lines(path)) {
    all = s.filter(nl->nl.getLine().contains(word))
            .map(NumberedLine::getNumber)
            .collect(Collectors.toList());
}
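For comparison, the Map.Entry alternative mentioned at the start of this answer could be sketched as follows. This is an illustration only: the class name `EntryLines` and the method `numberedLines` are hypothetical, the entry key holds the 1-based line number and the value holds the line, and the method takes a `BufferedReader` instead of a `Path` to keep the example small.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.AbstractMap;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, not part of any library.
final class EntryLines {

    // Lazily streams (lineNumber, line) pairs; closing the stream closes the reader.
    static Stream<Map.Entry<Integer, String>> numberedLines(BufferedReader reader) {
        Spliterator<Map.Entry<Integer, String>> sp =
            new Spliterators.AbstractSpliterator<Map.Entry<Integer, String>>(
                    Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
                private int lineNumber;

                @Override
                public boolean tryAdvance(Consumer<? super Map.Entry<Integer, String>> action) {
                    String line;
                    try {
                        line = reader.readLine();
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                    if (line == null) {
                        return false; // end of input
                    }
                    Map.Entry<Integer, String> entry =
                        new AbstractMap.SimpleImmutableEntry<>(++lineNumber, line);
                    action.accept(entry);
                    return true;
                }
            };
        return StreamSupport.stream(sp, false).onClose(() -> {
            try {
                reader.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

The pipelines stay the same, except that `getValue()` and `getKey()` take the place of `getLine()` and `getNumber()`. The dedicated type reads better; the entry version just saves one class.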
Holger
  • +1 for the spliterator implementation. It's rather unfortunate that the zip method was removed (I guess because of the `parallel()` feature). On the one hand, I like that this makes it "easy" to parallelize a task as long as you are aware of potential side effects or failures; on the other hand, the Stream API could have been richer without it. But I guess there are other considerations I'm still inexperienced with or unaware of that led to this decision. – Alexis C. Apr 27 '15 at 09:58
3

I think in this case the simplest thing you can do is to get an iterator from the stream and do an old-school search:

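    int lineNumber = -1; // stays -1 if the word is not found
    // try-with-resources closes the underlying file when the search ends
    try (Stream<String> lines = Files.lines(Paths.get(fileName))) {
        Iterator<String> iterator = lines.iterator();
        int current = 1;
        while (iterator.hasNext()) {
            if (iterator.next().contains(word)) {
                lineNumber = current;
                break;
            }
            current++;
        }
    }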

With this solution you don't read the whole file into memory just in order to be able to use stream operations.

lbalazscs