python update a column value of a csv file according to another csv file

Question

For example consider the files a.csv

1  23 'better' 0
2  21 'bad'    0
3  34 'good'   0

and b.csv

23 2
34 5

We can get the result:

1  23 'better' 2
2  21 'bad'    0
3  34 'good'   5

I want to update a column value of a csv according to other csv file. Do you have a good idea? — Huan Ren, Apr 02 '15 at 07:45
read 1st file as a list of lists, read second file as a dictionary keyed by the 1st column, output the list of llists checking if the second item is present in the dictionary and updating the last value accordingly. If you show us your code I can help fixing it. — gboffi, Apr 02 '15 at 07:50
with open('a.csv', 'r') as input1, open(b.csv', 'r') as input2: with open('c.csv', 'w') as output: r1 = csv.reader(input1) r2 = csv.reader(input2) w = csv.writer(output) for row1 in r1: for row2 in r2: if row1[2] == row2[0]: row1[7] = '1' w.writerow(row1) — Huan Ren, Apr 02 '15 at 07:53
This is a SQL join on your index column. pandas can do it. Also the `read_csv` — smci, Apr 02 '15 at 09:18

Marek · Accepted Answer · 2015-04-02T08:23:46.840

Thanks for making the question clearer. This code does not modify file A inplace and instead it uses output file fileC.

import csv #imports module csv

filea = "fileA.csv"
fileb = "fileB.csv"
output = "fileC.csv"

delim = ";" #set your own delimiter

source1 = csv.reader(open(filea,"r"),delimiter=delim)
source2 = csv.reader(open(fileb,"r"),delimiter=delim)
#open csv readers

source2_dict = {}

# prepare changes from file B
for row in source2:
    source2_dict[row[0]] = row[1]

# write new changed rows
with open(output, "w") as fout:
    csvwriter = csv.writer(fout, delimiter=delim)
    for row in source1:
        # needs to check whether there are any changes prepared
        if row[1] in source2_dict:
            # change the item
            row[3] = source2_dict[row[1]]
        csvwriter.writerow(row)

I hope I understood your intention well.

Just a short explanation of the steps:

First you specify the paths to source files and an output file and you also specify the delimiter.
Then you load CSV readers using csv module.
You read all the changes from source file B and store it in a
dictionary.
And then you iterate through file A, modify the row when necessary and then you save it to output file.

It's work! Thank you ! But I want to know the reason that why does my code don't work? Just because that two loops? — Huan Ren, Apr 02 '15 at 09:08

gboffi · Answer 2 · 2015-04-03T13:56:46.603

read a.csv

a_content = [l.split() for l in open('a.csv')]

read b.csv

b_dict = {k:v for k,v in [l.split() for l in open('b.csv')]}

output the modified data

for rec in a_content:
    if rec[1] in b_dict:
       rec[3] = b_dict[rec[1]]
    outfile.write(" ".join(rec)+'\n')

Addendum

Thinking about it, a single print statement will do

print "\n".join(" ".join(i[:-1]+([b[i[1]]]if i[1]in b else[i[-1]]))for b in[{k:v for k,v in[l.split()for l in open('b.csv')]}]for i in[l.split()for l in open('a.csv')])

python update a column value of a csv file according to another csv file

2 Answers2

Linked