How to extract specific lines from a huge data file?

Question

I have a very large data file, about 32GB. The file is made up of about 130k lines, each of which mainly contains numbers but also a few characters.

The task I need to perform is very clear: I have to extract 20 lines and write them to a new text file.

I know the exact line number for each of the 20 lines that I want to copy.

So the question is: how can I extract the content at a specific line number from the large file? I am on Windows. Is there a tool that can do this kind of operation, or do I need to write some code?

If there is no direct way of doing that, one possible approach is to first extract small blocks of the original file (so that each block contains one or more of the lines to extract) and then use a standard editor to find the lines within each block. In this case, the question would be: how can I split a large file into blocks by line on Windows? I use a tool named HJ-Split, which works very well with large files, but it can only split by size, not by line.

score 0 · Accepted Answer · edited May 23 '17 at 12:15

Install[1] Babun Shell (or Cygwin, but I recommend Babun), and then use the sed command as described here: How can I extract a predetermined range of lines from a text file on Unix?

[1] Installing Babun actually just means unzipping it somewhere, so you don't need Administrator rights on the server.
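For illustration, here is a minimal sketch of the sed approach, assuming a sed such as the one that ships with Babun or Cygwin. The file names and line numbers (3, 7 and 12) are placeholders, not taken from the question, and the first command only builds a small sample file so the sketch can be run as-is.

```shell
# Build a small 20-line sample file (placeholder for the real 32GB file).
seq 1 20 | sed 's/^/row /' > sample_data.txt

# -n suppresses sed's default output; "Np" prints only line N.
# "q" on the last wanted line stops sed there, so the rest of a
# huge file is never read.
sed -n '3p;7p;12{p;q;}' sample_data.txt > extracted_lines.txt

# Show the result.
cat extracted_lines.txt
```

For the real file you would list all 20 line numbers in the same way and put the `{p;q;}` on the largest one. Since sed streams the file line by line, it should not need to load the whole file into memory.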

answered Jun 22 '15 at 11:29

Jozef Chocholacek


Thank you, Babun works very well. It is super simple to install and it contains a long list of useful Unix commands (and more). All this at the cost of downloading about 270MB. – Luca Jun 23 '15 at 14:52
@LucaNaso You're welcome. And I agree, Babun is a great tool; I install it first thing on every Windows machine I have to deal with. :-) – Jozef Chocholacek Jun 23 '15 at 15:03
