
I need to split a file with 250k columns into several (~5) chunks, based on size (preferably) or on the number of columns. I am aware of the split command for row-wise splitting, but I don't know of a similar tool for column-wise splitting. The number of columns in my file doesn't divide evenly, so the chunks cannot all have the same number of columns.

Input:

AA BB CC DD EE FF GG HH II JJ KK LL MM
NN OO PP QQ RR SS TT UU VV WW XX YY ZZ

Desired output:

File1

AA BB CC DD
NN OO PP QQ

File2

EE FF GG HH
RR SS TT UU

File3

II JJ KK LL MM
VV WW XX YY ZZ 
user2162153

3 Answers

Using awk. You can adjust n to the number of columns you want per output file:

awk '{for (i=1; i<=NF; i++) {
         # end the line after every n-th field and after the last field, otherwise print FS
         printf "%s%s", $i, ((i % n == 0 || i == NF) ? RS : FS) > ("File" int((i-1)/n+1) ".txt")
      }}' n=5 file
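
With the two-line sample input from the question and n=5, the result should look like this:

$ head File*.txt
==> File1.txt <==
AA BB CC DD EE
NN OO PP QQ RR

==> File2.txt <==
FF GG HH II JJ
SS TT UU VV WW

==> File3.txt <==
KK LL MM
XX YY ZZ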
BMW

Use cut. It's part of GNU coreutils.

Assuming your input file columns are separated by a space:

cut -d " " -f1-4 /path/to/input/file > file1

cut -d " " -f5-8 /path/to/input/file > file2

...

See the man page man cut for more information.
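
With ~250k columns, typing each range by hand isn't practical, but a small shell loop can generate the cut calls for you. This is only a rough sketch, assuming space-separated fields, the same number of columns on every row, and a chunk width of 5 fields per output file:

infile=/path/to/input/file
width=5                                             # fields per output file
ncols=$(head -n1 "$infile" | tr ' ' '\n' | wc -l)   # column count, taken from the first row
start=1; i=1
while [ "$start" -le "$ncols" ]; do
    end=$((start + width - 1))
    cut -d ' ' -f "$start-$end" "$infile" > "file$i"
    start=$((end + 1))
    i=$((i + 1))
done

cut silently ignores the part of the last range that runs past the final column, so the loop needs no special case for the last chunk.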

Ken

I would use awk for this. I'm not sure you really want 5 columns per file: you mentioned 250k columns, which would mean creating 50k files. But here is something to get you started:

awk '{
  y=1                                     # index of the current output file
  for(i=1;i<NF;i++) {
    if(i%5==0) {
      print $i > ("text" y ".txt")        # every 5th field ends a line and its chunk
      y+=1                                # move on to the next output file
      continue
    }
    printf "%s ",$i > ("text" y ".txt")   # fields inside a chunk, space-separated
  }
  print $NF > ("text" y ".txt")           # the last field always ends the current line
}' file

Test:

$ cat file
AA BB CC DD EE FF GG HH II JJ KK LL MM
NN OO PP QQ RR SS TT UU VV WW XX YY ZZ

$ awk '{
  y=1
  for(i=1;i<NF;i++) { 
    if(i%5==0) {
      print $i > "text"y".txt"
      y+=1
      continue 
    }
  printf "%s ",$i >"text"y".txt"
  } 
print $NF > "text"y".txt"}' file

$ head text*
==> text1.txt <==
AA BB CC DD EE
NN OO PP QQ RR

==> text2.txt <==
FF GG HH II JJ
SS TT UU VV WW

==> text3.txt <==
KK LL MM
XX YY ZZ
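
Since the question actually asks for a fixed number of chunks (~5) rather than a fixed number of columns per file, a variation is to compute the chunk width from NF. This is just a sketch along the same lines, assuming every row has the same number of fields; the part*.txt names are placeholders:

awk -v chunks=5 '{
  width = int((NF + chunks - 1) / chunks)            # ceiling of NF/chunks fields per file
  for (i = 1; i <= NF; i++) {
    out = "part" int((i - 1) / width + 1) ".txt"     # which chunk field i falls into
    printf "%s%s", $i, ((i % width == 0 || i == NF) ? ORS : OFS) > (out)
  }
}' file

With chunks=5 this creates only as many output files as there are chunks, so the 50k-file concern mentioned above doesn't arise.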
jaypal singh