Split csv file vertically using command line

Question

Is it possible to split a csv file, vertically, into multiple files? I know we can split single large files into smaller files with no of rows mentioned using the command line. I have csv files in which columns are repeating after certain column no and I want to split that file column-wise.Is that possible with the command line, If not then how can we do it with python?

For Eg.

consider above sample in which site and address present multiple times vertically, I want to create 3 different csv files containing single site and single address

Any help would be highly appreciated,

Thanks

you might be interested in taking a look at the `cut` command — Moinuddin Quadri, Jan 07 '18 at 07:56
indeed [cut](https://en.wikipedia.org/wiki/Cut_(Unix)#Examples) is your friend. Please provide an actual input example if you want an adapted solution — ishahak, Jan 07 '18 at 08:24
@ishahak, I just edited the question, So basically I want to convert such files to multiple files and each file should contain only 2 rows i.e. site and address — Amit Sharma, Jan 07 '18 at 08:49
[Split dataframe into multiple output files](https://stackoverflow.com/questions/10002021/split-dataframe-into-multiple-output-files) — , Jan 07 '18 at 10:47
If any of the answers helped you, please accept it. It is a good habit... — ishahak, Jan 08 '18 at 03:43

score 3 · Accepted Answer · answered Jan 07 '18 at 11:11

Assuming your input files is named ~/Downloads/sites.csv and looks like this:

Google,google.com,Google,google.com,Google,google.com
MS,microsoft.com,MS,microsoft.com,MS,microsoft.com
Apple,apple.com,Apple,apple.com,Apple,apple.com

You can use cut to create 3 files, each containing one pair of company/site:

cut -d "," -f 1-2 < ~/Downloads/sites.csv > file1.csv
cut -d "," -f 3-4 < ~/Downloads/sites.csv > file2.csv
cut -d "," -f 5-6 < ~/Downloads/sites.csv > file3.csv

Explanation:

For the cut command, we declare the comma (,) as a separator, which splits every line into a set for 'fields'.

We then specify for each output file, which fields we want to be included.

HTH!

tshiono · Answer 2 · 2018-01-07T11:34:00.357

0

If the site-address pairs are regularly repeated, how about:

awk '{
    n = split($0, ary, ",");
    for (i = 1; i <= n; i += 2) {
        j = (i + 1) / 2;
        print ary[i] "," ary[i+1] >> "file" j ".csv";
    }
}' input.csv

edited Jan 07 '18 at 11:34

answered Jan 07 '18 at 11:27

tshiono

21,248
2
14
22

score 0 · Answer 3 · answered Jan 07 '18 at 14:06

The following script produces what you want (based on the SO answer adjusted for your needs: number of columns, field separator). It splits the original file vertically into 2 column chunks (note n=2) and creates 3 different files (tmp.examples.1, tmp.examples.2, tmp.examples.3 or whatever you specify for the f variable):

awk -F "," -v f="tmp.examples" '{for (i=1; i<=NF; i++) printf (i%n==0||i==NF)?$i RS:$i FS > f "." int((i-1)/n+1) }' n=2 example.txt

If your example.txt file has the subsequent data:

site,address,site,address,site,address
Google,google.com,MS,microsoft.com,Apple,apple.com

Split csv file vertically using command line

3 Answers3