0

I have a number of csv files which have variable length rows. For example The following:

Time,0,8,18,46,132,163,224,238,267,303
X,0,14,14,14,15,16,17,15,15,15
Time,0,4,13,22,32,41,50,59,69,78,87,97,106,115,125,127,137,146,155,165,174,183,192,202,211,220,230,239,248,258,267,277,289,298,308
Y,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Time,0,4,13,22,32,41,50,59,69,78,87,97,106,115,125,127,137,146,155,165,174,183,192,202,211,220,230,239,248,258,267,277,289,298,308
Z,0,1,2,1,1,1,1,1,1,2,2,1,0,1,1,2,2,2,2,2,1,1,2,2,2,1,1,1,1,1,2,2,2,2,2
Time,0,308
W,0,0

Becomes:

Time,X,Time,Y,Time,Z,Time,W
0,0,0,0,0,0,0,0
8,14,4,0,4,1,308,0

A lot of data has been lost, it only took the first 2 of each.

I want to transpose this CSV in python. I have the following program:

import csv
import os
from itertools import izip
import sys

try:
    filename = sys.argv[1]
except IndexError:
    print 'Please add a filename'
    exit(-1)
with open(os.path.splitext(filename)[0] + '_t.csv', 'wb') as outfile, open(filename, 'rb') as infile:
    a = izip(*csv.reader(infile))
    csv.writer(outfile).writerows(a)

However it seems to trim a lot of data because the file has dropped from 20KB to 6KB and only keeps up to the minimum row length.

Any ideas how to not drop any data?

jnd
  • 754
  • 9
  • 20
  • is the snippet a single csv file or three different csv files? and can you post the problematic output? – maxymoo Jul 21 '15 at 05:19
  • Ok I added more information and it's one CSV transposing. – jnd Jul 21 '15 at 05:27
  • 1
    What should the third row's `W` data be, and what the eleventh row's `X` data? With what values should the data be padded? – Finwood Jul 21 '15 at 05:29

2 Answers2

2

izip zips according to the shortest array , so you are getting only the values from each row for the length of the shortest array.

You should use izip_longest instead of that , it zips with the longest array, and it would put None where there are no values.

Example -

import csv
import os
from itertools import izip_longest
import sys

try:
    filename = sys.argv[1]
except IndexError:
    print 'Please add a filename'
    exit(-1)
with open(os.path.splitext(filename)[0] + '_t.csv', 'wb') as outfile, open(filename, 'rb') as infile:
    a = izip_longest(*csv.reader(infile))
    csv.writer(outfile).writerows(a)

Result I got from this -

Time,X,Time,Y,Time,Z,Time,W

0,0,0,0,0,0,0,0

8,14,4,0,4,1,308,0

18,14,13,1,13,2,,

46,14,22,1,22,1,,

132,15,32,1,32,1,,

163,16,41,1,41,1,,

224,17,50,1,50,1,,

238,15,59,1,59,1,,

267,15,69,1,69,1,,

303,15,78,1,78,2,,

,,87,1,87,2,,

,,97,1,97,1,,

,,106,1,106,0,,

,,115,1,115,1,,

,,125,1,125,1,,

,,127,1,127,2,,

,,137,1,137,2,,

,,146,1,146,2,,

,,155,1,155,2,,

,,165,1,165,2,,

,,174,1,174,1,,

,,183,1,183,1,,

,,192,1,192,2,,

,,202,1,202,2,,

,,211,1,211,2,,

,,220,1,220,1,,

,,230,1,230,1,,

,,239,1,239,1,,

,,248,1,248,1,,

,,258,1,258,1,,

,,267,1,267,2,,

,,277,1,277,2,,

,,289,1,289,2,,

,,298,1,298,2,,

,,308,1,308,2,,
Anand S Kumar
  • 88,551
  • 18
  • 188
  • 176
0

This is an approach without itertools.izip:

import csv

with open('transpose.csv') as infile, \
        open('out.csv', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    while True:
        try:
            index = next(reader)
            data = next(reader)
        except StopIteration:
            break
        writer.writerows(zip(index, data))

With your given input, this snippet produces the following out.csv:

Time,X
568,0
573,0
577,1
581,1
585,0
590,2
594,0
599,0
603,0
Time,Y
590,0
594,3
599,3
03,0
Time,Z
599,0
603,1

Is this what you desire?


Update

This modified example should match your updated question:

import csv
from itertools import zip_longest  # izip_longest in Python 2

with open('transpose.csv') as infile, \
        open('out.csv', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)

    writer.writerows(zip_longest(*reader, fillvalue=0))

Update the fillvalue to what you'd like to replace missing values with.

Finwood
  • 3,829
  • 1
  • 19
  • 36