I am trying to transpose a huge tab-delimited file with about 6000 rows and 2 million columns. The preferable approach should not involve holding the whole file in memory, which seems to be what the answer in this question does:
- Are the columns fixed width, or do they all have different widths? – Sven Marnach Jun 18 '13 at 10:20
- Unfortunately the first two columns are different from the others; they are text strings with different widths, but the other columns are all numbers with fixed widths. – qed Jun 18 '13 at 10:55
- But these two columns are not of much importance and can be removed if necessary. – qed Jun 18 '13 at 11:10
- I just left an answer to a question identical to yours here: http://stackoverflow.com/questions/7156539/how-do-i-transpose-pivot-a-csv-file-with-python-without-loading-the-whole-file/26122437#26122437 – tommy.carstensen Sep 30 '14 at 13:46
1 Answer
One approach would be to iterate over the input file once for every column (untested code!):

    import itertools

    with open("input") as f, open("output", "w") as g:
        try:
            for column_index in itertools.count():
                f.seek(0)
                # Strip the newline so the last column is written cleanly.
                col = [line.rstrip("\n").split("\t")[column_index] for line in f]
                g.write("\t".join(col) + "\n")
        except IndexError:
            pass
This is going to be very slow, but it only keeps a single line at a time in memory.
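The pass-per-column idea above can also be extended to gather a small block of columns per pass, cutting the number of passes over the file at a modest memory cost. A minimal sketch of that variant (the `BLOCK` size and the file names are illustrative assumptions, not from the answer):

    import itertools

    BLOCK = 2  # assumed block size: columns gathered per pass over the input

    # Create a tiny sample input for demonstration.
    with open("input.tsv", "w") as f:
        f.write("a\tb\tc\n1\t2\t3\n")

    with open("input.tsv") as f, open("output.tsv", "w") as g:
        for start in itertools.count(step=BLOCK):
            f.seek(0)
            # Collect columns start .. start+BLOCK-1 from every row.
            cols = [[] for _ in range(BLOCK)]
            found = False
            for line in f:
                fields = line.rstrip("\n").split("\t")
                for i in range(BLOCK):
                    if start + i < len(fields):
                        cols[i].append(fields[start + i])
                        found = True
            if not found:
                break  # no row has a column at this index: done
            for col in cols:
                if col:
                    g.write("\t".join(col) + "\n")

Memory use grows with `BLOCK` times the number of rows, so for 6000 rows even a few hundred columns per pass stays small while reducing 2 million passes to a few thousand.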

Sven Marnach