I have a very large data file, about 32GB. The file is made up of about 130k lines, each of which mainly contains numbers but also has a few characters.

The task I need to perform is very clear: I have to extract 20 lines and write them to a new text file.

I know the exact line number for each of the 20 lines that I want to copy.

So the question is: how can I extract the content at specific line numbers from the large file? I am on Windows. Is there a tool that can do this kind of operation, or do I need to write some code?

If there is no direct way of doing that, one possible approach is to first extract small blocks of the original file (so that each block contains one or more of the lines to extract) and then use a standard editor to find the lines within each block. In that case, the question would be: how can I split a large file into blocks by line on Windows? I use a tool named HJ-Split, which works very well with large files, but it can only split by size, not by line.

Luca

1 Answer

Install[1] Babun Shell (or Cygwin, though I recommend Babun), and then use the sed command as described here: How can I extract a predetermined range of lines from a text file on Unix?

[1] Installing Babun actually just means unzipping it somewhere, so you don't need Administrator rights on the server.
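
If installing a Unix shell is not an option, the same extraction can be done with a few lines of code. Below is a minimal Python sketch, not part of the original answer: the function name `extract_lines` and its parameters are made up for illustration, and it assumes line numbers are counted from 1 and that lines end with a newline character. It streams the file once, so the 32GB file is never loaded into memory, and it stops reading as soon as the last wanted line has been copied.

```python
def extract_lines(src_path, dst_path, line_numbers):
    """Copy the given 1-based line numbers from src_path to dst_path.

    Hypothetical helper for illustration. Lines are written in file
    order, regardless of the order of `line_numbers`.
    Returns the number of lines actually written.
    """
    wanted = set(line_numbers)
    last = max(wanted, default=0)  # no need to read past this line
    written = 0
    # Binary mode: no decoding and no newline translation, so each
    # line is copied byte for byte.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for number, line in enumerate(src, start=1):
            if number in wanted:
                dst.write(line)
                written += 1
            if number >= last:
                break
    return written
```

A call such as `extract_lines("big.txt", "picked.txt", [12, 4051, 98000])` (file names and line numbers are placeholders) would write those three lines to `picked.txt`. If a requested line number is beyond the end of the file, it is simply skipped, which shows up as a return value smaller than the number of lines requested.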

Jozef Chocholacek
  • Thank you, Babun works very well. It is super simple to install and it contains a long list of useful Unix commands (and more). All this at the cost of downloading about 270MB. – Luca Jun 23 '15 at 14:52
  • @LucaNaso You're welcome. And I agree, Babun is a great tool; I install it first thing on every Windows machine I have to deal with. :-) – Jozef Chocholacek Jun 23 '15 at 15:03