4

Assuming I have a 15 GB log file and would like to iterate over the \n-terminated lines in it, which Java standard library or third-party classes provide a clean interface for this operation?

Please note that I'm looking for an NIO-based solution, preferably using memory-mapped file access, as demonstrated in the question "How do I create a Java string from the contents of a file?". That would be a perfect solution had it not loaded the whole byte buffer into memory before returning a new String() instance of the buffer. That approach does not work in this case because of the size of the input.

Thank you,
Maxim.

Maxim Veksler

3 Answers

4

Have you considered using a BufferedReader? From the documentation:

Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.

It has a clean interface for getting \n-terminated strings (BufferedReader.readLine()) and should be fairly efficient since it is buffered.
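As a minimal sketch of that readLine() loop (the class name LineIteration, the method countMatchingLines, and the "ERROR" search string are hypothetical, chosen only for illustration), the reader can be opened with an explicit charset and closed automatically with try-with-resources. Only the current line is held in memory, so the file size should not matter for heap usage, assuming no single line is itself enormous:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineIteration {

    /**
     * Counts the lines that contain the given text.
     * Hypothetical helper: the per-line work here is just a substring check.
     */
    static long countMatchingLines(Path logFile, String needle) throws IOException {
        long matches = 0;
        // Assumes the log is UTF-8; pass a different Charset if it is not.
        try (BufferedReader reader = Files.newBufferedReader(logFile, StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null at end of input and strips the line terminator.
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LineIteration <log-file>");
            return;
        }
        System.out.println(countMatchingLines(Paths.get(args[0]), "ERROR"));
    }
}
```

Note that readLine() also treats \r and \r\n as line terminators, not only \n.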

aioobe
    +1: The bottleneck is likely to be the time it takes to read a 15 GB file from disk. How you do it is unlikely to matter much, so it's best to keep it simple. – Peter Lawrey Feb 06 '11 at 18:47
3

IMHO you do not need any NIO for this task. Use regular BufferedReader:

BufferedReader reader = new BufferedReader(new FileReader("myfile.log"));

Then use reader.readLine().

AlexR
2

It's not NIO-based, but I'd take a look at Guava's CharStreams.readLines(InputSupplier, LineProcessor) method. I'd say it does what you want:

File file = ...
Foo result = CharStreams.readLines(Files.newReaderSupplier(file, Charsets.UTF_8),
    new LineProcessor<Foo>() {
      public boolean processLine(String line) {
        // do stuff for this line
        return true; // or false if you want to stop processing here
      }

      public Foo getResult() {
        return result; // if you create some result when processing the lines
      }
    });

This uses a callback to allow you to process each line in the file in sequence. It doesn't load the next line into memory until you're done processing the current one. If you don't want to create a single result object when reading the lines, you can simply use LineProcessor<Void> and have getResult() return null.
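To illustrate the callback pattern itself without a Guava dependency, here is a rough standard-library sketch. The LineHandler interface is a hypothetical stand-in that only mirrors the shape of LineProcessor (processLine plus getResult), and CallbackLines, readLines and countUntilBlank are made-up names; this is not Guava's implementation:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class CallbackLines {

    /** Hypothetical stand-in with the same two methods as Guava's LineProcessor. */
    interface LineHandler<T> {
        boolean processLine(String line) throws IOException; // return false to stop early
        T getResult();
    }

    /** Feeds lines to the handler one at a time; the next line is read only after the callback returns. */
    static <T> T readLines(Reader source, LineHandler<T> handler) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!handler.processLine(line)) {
                    break; // the handler asked to stop
                }
            }
        }
        return handler.getResult();
    }

    /** Example handler: counts lines until the first empty line, then stops. */
    static int countUntilBlank(Reader source) throws IOException {
        return readLines(source, new LineHandler<Integer>() {
            private int count = 0;

            public boolean processLine(String line) {
                if (line.isEmpty()) {
                    return false;
                }
                count++;
                return true;
            }

            public Integer getResult() {
                return count;
            }
        });
    }
}
```

Keeping the accumulated state inside the handler is what lets getResult() return something meaningful once reading stops.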

ColinD