0

I have a very large data set, and I want the fastest way to get every nth line (for example, if the file is 1M lines long, I'd want every 1000th line).

Ideally I'm looking for a way to jump to each line number, but I haven't found a way to do that yet.

My workaround is to split the original data file (using the Unix "split" command), then take the top line of each piece.

I'm curious if there is a way to jump to a specific line number in Java without iterating through other lines in the file. If not, is it more efficient to split the file, or use BufferedReader until I get to my desired line?

Any help is greatly appreciated!

AlyssaKm
  • [Possible duplicate](http://stackoverflow.com/questions/2312756/in-java-how-to-read-from-a-file-a-specific-line-given-the-line-number) – UnknownOctopus Jul 22 '15 at 04:19

1 Answer

2

Splitting into subfiles has nothing to recommend it. It adds latency and wastes disk space. "split" still has to read the whole file, so it's the same work as reading the file yourself, plus the cost of writing out all the pieces.

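There is no way to jump straight to a line number in a plain text file, because lines vary in length and nothing records where each one starts. That doesn't matter much, though: you can read millions of lines a second with BufferedReader. Do it the simple way. Use a LineNumberReader, which extends BufferedReader, and read lines until the line count is the one(s) you want.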
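Something along these lines should do it. This is only a sketch: the class name `EveryNthLine` and the helper `everyNthLine` are made up for illustration, and it assumes you want lines n, 2n, 3n, ... counted from 1. It relies on `LineNumberReader.getLineNumber()` returning the number of lines read so far, which is 1 right after the first `readLine()`.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class EveryNthLine {

    // Hypothetical helper: collects lines n, 2n, 3n, ... (1-based) from the source.
    static List<String> everyNthLine(Reader source, int n) throws IOException {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> picked = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // After readLine() returns, getLineNumber() is the 1-based
                // number of the line that was just read.
                if (reader.getLineNumber() % n == 0) {
                    picked.add(line);
                }
            }
        }
        return picked;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java EveryNthLine <file> <n>");
            return;
        }
        int n = Integer.parseInt(args[1]);
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
            for (String line : everyNthLine(in, n)) {
                System.out.println(line);
            }
        }
    }
}
```

For a 1M-line file you would call it with n = 1000. If even the selected lines are too many to hold in memory, print or process each one inside the loop instead of adding it to a list.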

user207421