1

I have 4 csv files and I would like to combine them into single csv file using python language. For example,

csv1

id name age

csv2

id degree stage

csv3

id color car

csv4

id tv game

I would like to combine them into the csv file:

id name age degree stage color car tv game

This is my simple code but I doest work. Help please

import csv

reader = csv.reader(open('output1.csv', 'rb'))
reader1 = csv.reader(open('output2.csv', 'rb'))
reader2 = csv.reader(open('output3.csv', 'rb'))
reader3 = csv.reader(open('output4.csv', 'rb'))
writer = csv.writer(open('appended_output.csv', 'wb'))
for row in reader:
    row1 = reader1.next()
    row2 = reader2.next()
    row3 = reader3.next()
    writer.writerow(row + row1 + row2 + row3)
skysky2000
  • 73
  • 7

1 Answers1

1

csv.reader returns an iterable, so you should be using next(readerx) instead of readerx.next

for row in reader:
    row1 = next(reader1)
    row2 = next(reader2)
    row3 = next(reader3)
    writer.writerow(row + row1 + row2 + row3)

Or you could use pandas like suggested here

except that you would use pandas.read_csv and DataFrame.to_csv, to use CSV rather than excel format.

Handling tables with different sizes

One limitation with the above approach is that since the rows are concatenated individually, if a table has rows of different lengths, it will mess the columns to the right. Other limitation is that if the table has different numbers of rows there it will fail if any table is shorter than output1.csv if any is longer than output1.csv some rows will be lost. The code below address these limitations by filling emtpy cells. The limitation of this approach is that the tables must be loaded to the memory.

def make_uniform_rows(table, numrows, empty=''):
    '''
      Given a table as a list of lists, produces an output
      ensuring that all lists have the same length
      filling the empty cells
    '''
    out = []
    ncols = max(len(row) for row in table)
    for i in range(table, numrows):
        row = table[i] if i < len(table) else []
        row += [empty] * (ncols - len(row))
        out.append(row);
    return out;

def concatenate_tables(readers):
    # load everything to memory
    tables = [list(reader) for reader in readers];
    # determine the number of rows of the bigger csv
    numrows = max(len(table) for table in tables)
    
    return make_uniform_rows(tables, numrows)
    
    
reader0 = csv.reader(open('output1.csv', 'rb'))
reader1 = csv.reader(open('output2.csv', 'rb'))
reader2 = csv.reader(open('output3.csv', 'rb'))
reader3 = csv.reader(open('output4.csv', 'rb'))
writer = csv.writer(open('appended_output.csv', 'wb'))


for row in concatenated_tables([reader0, reader1, reader2, reader3]):
    writer.writerow(row + row1 + row2 + row3)
Bob
  • 13,867
  • 1
  • 5
  • 27