
I have a single huge CSV file that contains a header line and then hundreds of thousands of records.

I want to split it into multiple files, each containing the same header followed by 10,000 records (or whatever is left, for the last file).

If I didn't care about the header, I'd do `split -l 10000 myfile`. However, I need each file to contain the header.

How do I do this?

Arsen Zahray

1 Answer


Split the file, excluding the header:

tail -n +2 myfile | split -l 10000 - prefix-

Get the header line:

head -n 1 myfile > header

And then append it to all the generated files:

# Prepend the header to each chunk via a temporary file
for file in prefix-*; do
   cat header "$file" > "$file.new"
   mv "$file.new" "$file"
done
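
As an alternative to the tail/split/cat steps above, the same result can probably be had in a single pass with awk. This is only a rough sketch, not a replacement for the approach above: the function name `split_with_header`, the three-digit numeric suffix, and the tiny demo file are all made up for illustration. Like `split -l`, it works line by line, so it assumes no record contains an embedded newline.

```shell
#!/bin/sh
# split_with_header FILE LINES PREFIX
# (hypothetical helper: writes PREFIX001, PREFIX002, ... each starting with the header)
split_with_header() {
    awk -v n="$2" -v prefix="$3" '
        NR == 1 { header = $0; next }        # remember the header, do not treat it as a record
        (NR - 2) % n == 0 {                  # first record of a new chunk
            if (out != "") close(out)        # close the previous chunk to limit open files
            out = sprintf("%s%03d", prefix, ++count)
            print header > out               # every chunk begins with the header
        }
        { print > out }                      # write the record to the current chunk
    ' "$1"
}

# Demo on a throwaway file: 1 header + 5 records, 2 records per chunk.
demo_dir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > "$demo_dir/myfile"
split_with_header "$demo_dir/myfile" 2 "$demo_dir/prefix-"
```

For the file in the question the call would be something like `split_with_header myfile 10000 prefix-`. The last chunk simply holds whatever records are left.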
larsks
  • Thanks! What does the `+2` in the `tail` command stand for? I see that it skips the first line, but why does it do that? – Arsen Zahray Apr 10 '21 at 19:00
  • Google "man tail" to get the documentation (MANual) for the *tail* utility. You'll find that `-n +2` tells *tail* to start output at the second line, instead of printing the last 10 lines (the default). – phunsoft Apr 10 '21 at 19:20
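
To make the `-n +2` behaviour concrete, here is a small sketch on made-up sample data (the variable names and the four-line input are only for illustration). With a leading `+`, the number is a starting line counted from the top. Without it, the number is a count of lines taken from the end.

```shell
# Four-line sample input; the first line plays the role of the CSV header.
sample=$(printf 'header\nrow1\nrow2\nrow3\n')

# -n +2: start output at line 2, so only the header is skipped.
without_header=$(printf '%s\n' "$sample" | tail -n +2)
printf '%s\n' "$without_header"

# -n 2 (no plus sign): print the last 2 lines instead.
last_two=$(printf '%s\n' "$sample" | tail -n 2)
printf '%s\n' "$last_two"
```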