
I already have a CSV file created from a list using csv.writer. I want to append another list, created through a for loop, to that CSV file column-wise.

The first code to create a CSV file is as follows:

import csv

with open("output.csv", "wb") as f:  # "wb" is the Python 2 convention for csv files
    writer = csv.writer(f)
    for row in zip(master_lst):  # zip() here wraps each element in a 1-tuple
        writer.writerow(row)

I created the CSV file using the list master_lst and the output is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG

Then I create another list (ind_lst) through a for loop, and its contents have to be appended column-wise to the CSV file created in the previous step. I used the following code:

with open("output.csv", "ab") as f:
    writer = csv.writer(f)
    for row in zip(ind_lst):
        writer.writerow(row)

The output I obtained is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG
sample1
3
3
1
sample2
4
4
1
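(This row-wise result is actually what `zip` over a single list produces: each element gets wrapped in a one-tuple, so `writerow` receives one field per row. A quick sketch:)

```python
ind_lst = ['sample1', '3', '3', '1']

# zip() over a single iterable yields 1-tuples, so each value
# becomes its own single-field row when passed to writerow()
rows = list(zip(ind_lst))
print(rows)  # [('sample1',), ('3',), ('3',), ('1',)]
```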

However, I need the output column-wise, as follows:

read                         sample1     sample2
ACACCUGGGCUCUCCGGGUACC         3            4
ACGGCUACCUUCACUGCCACCC         3            4
AGGCAGUGUGGUUAGCUGGUUG         1            1

I looked for solutions, but could only find ones for appending row-wise ("append new row to old csv file python"), whereas I need to append column-wise.

I also tried writer.writerows instead of writer.writerow, but I get this error:

_csv.Error: sequence expected

The output is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG
s                        a   m   p  l  e 1

As you can see, it spreads the characters of the list's first element across separate cells and then terminates with an error. I am a beginner in Python, so if anyone could help solve this issue that would be awesome.
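(A minimal reproduction of that behaviour: `writerows` expects a list of rows, so a bare string is split into one-character fields, and a bare number is rejected outright; the exact error message differs between Python 2 and 3.)

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# a bare string is treated as a row, i.e. a sequence of 1-character fields
writer.writerows(['sample1'])
print(buf.getvalue().strip())  # s,a,m,p,l,e,1

# a bare number is not a sequence at all, so writerows() raises csv.Error
try:
    writer.writerows([3])
except csv.Error as exc:
    print(exc)  # "sequence expected" (Python 2) / "iterable expected" (Python 3)
```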

EDIT:

The master_lst is created using the following code:

import sys

with open(sys.argv[1], "r") as infile:
    lines = infile.readlines()[1:]  # skip the header line

master_lst = ["read"]
for line in lines:
    fields = line.strip().split(',')
    master_lst.append(fields[3])  # keep the fourth field

The ind_lst is created using the following code:

for file in files:
    ind_lst = []
    if file.endswith('.fa'):
        first_field = file.split(".")[0]
        ind_lst.append(first_field)
        individual_dict = {}
        with open(file) as fasta:
            for line in fasta:
                line = line.strip()
                if line == '':
                    continue
                if line.startswith('>'):
                    header = line.lstrip('>')
                    individual_dict[header] = ''
                else:
                    individual_dict[header] += line
    for i in master_lst[1:]:
        # default to 0 when the read is absent from this file's dictionary
        a = individual_dict.get(i, 0)
        ind_lst.append(a)
rex
  • Please [edit] your question and add code defining the `master_lst` and, more importantly, the `for` loop that creates the `ind_lst`. – martineau Feb 12 '17 at 15:43
  • @martineau: Hi martineau, I have updated the info that you requested – rex Feb 12 '17 at 19:05
  • @martineau: I was looking in the answers section and missed your comment above. Sorry that I didn't explain my issue clearly: an ind_lst is created for each file with the .fa extension in the current working directory, and its contents have to be appended column-wise to the output CSV file. So all the .fa files are read, the ind_lsts are created and appended column-wise. In the example I provided, there are 2 .fa files used to create ind_lst through the for loop. Please let me know if you have any questions or if my question is still not clear – rex Feb 12 '17 at 19:23
  • rex: Good move. However, I got tired of waiting for you to do it and answered your question as best I could from what was in it at the time. You could probably simplify the part of my answer that rearranges the `ind_lst` by creating `ind_lst` as a list of lists, with the nth element of each sublist coming from each of the fasta files. If you can't figure out how to do that, it's time to ask another question. – martineau Feb 12 '17 at 19:26
  • rex: Thanks. [My answer](http://stackoverflow.com/questions/21680473/how-can-i-open-multiple-files-files-unknown-beforehand-using-with-open-stat) to an unrelated question might help you create the `ind_lst` from multiple fasta files all open and being read at the same time (hint-hint). – martineau Feb 12 '17 at 19:35
  • @martineau When I use smaller files, I can create the read counts in a reasonable time, but when the size of the master_lst increases, the execution time slows down drastically. I don't know how to use the multiprocessing module. I have posted a question regarding this here: http://stackoverflow.com/questions/42602306/how-to-use-multiprocessing-module-to-iterate-a-list-and-match-it-with-a-key-in-d Could you please suggest a solution? – rex Mar 05 '17 at 12:06
  • Yes. See the [latest revision](http://stackoverflow.com/a/42613205/355230) to my answer to your [follow-on question](http://stackoverflow.com/questions/42602306/how-to-use-multiprocessing-module-to-iterate-a-list-and-match-it-with-a-key-in-d). – martineau Mar 10 '17 at 21:55

2 Answers


You need to combine the data from read, sample1 and sample2 into a row.

Assuming those are iterables, you could use zip to combine them:

for row in zip(read, sample1, sample2):
    writer.writerow(row)
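For instance, with the data from the question (Python 3 shown, writing to an in-memory buffer just for the demonstration):

```python
import csv
import io

read = ['read', 'ACACCUGGGCUCUCCGGGUACC', 'ACGGCUACCUUCACUGCCACCC',
        'AGGCAGUGUGGUUAGCUGGUUG']
sample1 = ['sample1', 3, 3, 1]
sample2 = ['sample2', 4, 4, 1]

buf = io.StringIO()
writer = csv.writer(buf)
for row in zip(read, sample1, sample2):  # one tuple per output row
    writer.writerow(row)

print(buf.getvalue())
# read,sample1,sample2
# ACACCUGGGCUCUCCGGGUACC,3,4
# ACGGCUACCUUCACUGCCACCC,3,4
# AGGCAGUGUGGUUAGCUGGUUG,1,1
```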
Roland Smith
  • sample1 and sample2 are not iterable, I guess. They are lists created through a for loop; the list ind_lst should become empty after each iteration of the for loop in order to accommodate the new list to be created and appended. Is there any other solution to this issue? – rex Feb 12 '17 at 14:38
  • @rex No. You will *have* to make an iterable for a row if you want multiple columns. – Roland Smith Feb 12 '17 at 15:45

You're actually trying to append several columns to the existing file, even though the data for these new columns is all stored in a single list. It might be better to arrange the data in ind_lst differently, but since you haven't shown how it's built, the code below works with the format in your question.

Modifying CSV files in place is tricky, since they're really just text files, so it's much easier to create a new file with the merged data, delete the original, and then rename the new file to match it (you've now been warned).

import csv
from itertools import izip  # Python 2
import os
import tempfile

master_lst = [
    'read',
    'ACACCUGGGCUCUCCGGGUACC',
    'ACGGCUACCUUCACUGCCACCC',
    'AGGCAGUGUGGUUAGCUGGUUG'
]

ind_lst = [
    'sample1',
    '3',
    '3',
    '1',
    'sample2',
    '4',
    '4',
    '1'
]

csv_filename = 'output.csv'

def grouper(n, iterable):
    's -> (s0,s1,...sn-1), (sn,sn+1,...s2n-1), (s2n,s2n+1,...s3n-1), ...'
    return izip(*[iter(iterable)]*n)

# first create file to update
with open(csv_filename, 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(((row,) for row in master_lst))

# Rearrange ind_lst so it's a list of pairs of values.
# The number of resulting pairs should be equal to length of the master_lst.
# Result for example data:  [('sample1', 'sample2'), ('3', '4'), ('3', '4'), ('1', '1')]
new_cols = zip(*grouper(len(master_lst), ind_lst))
assert len(new_cols) == len(master_lst)

# dir='.' keeps the temp file on the same filesystem so os.rename() works;
# delete=False because the file is renamed away before the with-block exits
with open(csv_filename, 'rb') as fin, tempfile.NamedTemporaryFile(
        'r+b', dir='.', delete=False) as temp_file:
    reader = csv.reader(fin)
    writer = csv.writer(temp_file)
    nc = iter(new_cols)
    for row in reader:
        row.extend(next(nc))  # add new columns to each row
        writer.writerow(row)
    else:  # for loop completed, replace original file with temp file
        fin.close()
        os.remove(csv_filename)
        temp_file.flush()  # flush the internal file buffer
        os.fsync(temp_file.fileno())  # force writing of all data in temp file to disk
        os.rename(temp_file.name, csv_filename)

print('done')

Contents of file after creation followed by update:

read,sample1,sample2
ACACCUGGGCUCUCCGGGUACC,3,4
ACGGCUACCUUCACUGCCACCC,3,4
AGGCAGUGUGGUUAGCUGGUUG,1,1
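(A note for Python 3 readers: `itertools.izip` is gone, since plain `zip` is already lazy, and CSV files should be opened in text mode with `newline=''`. A minimal sketch of the same rearrangement under those assumptions:)

```python
import csv

master_lst = ['read', 'ACACCUGGGCUCUCCGGGUACC',
              'ACGGCUACCUUCACUGCCACCC', 'AGGCAGUGUGGUUAGCUGGUUG']
ind_lst = ['sample1', '3', '3', '1', 'sample2', '4', '4', '1']

def grouper(n, iterable):
    # chop the flat ind_lst into per-sample chunks of length n
    return zip(*[iter(iterable)] * n)

# transpose the chunks into per-read column values:
# [('sample1', 'sample2'), ('3', '4'), ('3', '4'), ('1', '1')]
new_cols = list(zip(*grouper(len(master_lst), ind_lst)))

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for value, extras in zip(master_lst, new_cols):
        writer.writerow([value, *extras])
```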
martineau