0

Using Python 2.7 to open up file a turn it into a dictionary with row 0 as key and row 3 over as values. Then open up file b, for the matching ID number in row 0 to append the other columns for the match. Then convert back into a list and merge two CSV's.

At the end add headers and save as "output.csv."

And example of what I was expecting:

File a:

1234, 14/12/2,123.4,
5618, 13/1/12,23.4,
9143, 15/2/14,222.4,

File b:

1234,abc,cda
9143,nda, bad 
5618, ede, fpc

(dream) output:

payment date, payment amount, other id, other other id, payment status, payment type
 14/12/2,123.4,1234,abc,cda, Received, Recurring
 13/1/12,23.4,9143,nda, bad, Received, Recurring 
 15/2/14,222.4,5618, ede, fpc,Received, Recurring

(actual) output:

ID, payment date, payment amount, other id, other other id, payment status, payment type
1234, 14/12/2,123.4,1234,abc,cda, Received, Recurring

 5618, 13/1/12,23.4,9143,nda, bad, Received, Recurring 

 9143, 15/2/14,222.4,5618, ede, fpc,Received, Recurring

The code:

import csv

#create a dict from first csv, with clearing solution id as key
with open("DDS.csv", "rb") as f:
    first = {rows[0]: rows[3:] for rows in list(csv.reader(f))}

# compare second csv, append rank, add received recurring columns
with open("report.csv", "rb") as f:
    for row in csv.reader(f):
        if row and row[0] in first:  # row[0] = clearing solution id
            first[row[0]].append(row[1])  # row[1] = rank
            first[row[0]].append(row[2])
            first[row[0]].append('Received')
            first[row[0]].append('Recurring')


# convert dict back to list
merged = [(k,) + tuple(v) for k, v in first.items()]

# write list to output csv
with open('output.csv', "w") as f:
    writer = csv.DictWriter(f, fieldnames =['ID', 'Payment Date', 'Payment Amount', 'Other ID','other other ID', 'Payment Status', 'Payment Type'])
    writer.writeheader()
    csv.writer(f).writerows(merged)

Bonus points: How do I delete the first column from the output CSV?

Thanks

Marius
  • 58,213
  • 16
  • 107
  • 105
  • The immediate issue you're seeing with the extra blank lines should be solved by adding a `lineterminator='\n'` argument to `csv.DictWriter()`, on Windows the csv module uses a lineterminator of `\r\n` by default, see similar questions e.g [here](http://stackoverflow.com/q/19618856/1222578) – Marius Apr 14 '15 at 00:30

2 Answers2

0

You can get rid of the first just don't add (k,) to your tuple and remove 'ID' from your fieldnames. You don't need to be creating another writer to write the rows csv.writer(f).writerows(merged)?:

merged = [tuple(v) for k, v in first.items()]
with open('output.csv', "w") as f:
    writer = csv.DictWriter(f, fieldnames =['Payment Date', 'Payment Amount', 'Other ID','other other ID', 'Payment Status', 'Payment Type'])
    writer.writeheader()
    writer.writerows(merged)
AChampion
  • 29,683
  • 4
  • 59
  • 75
-1

I suggest using classical python string manipulation instead of csv module.

For example using rows.replace(' ','').split(,)[0] instead of rows[0] should solve the space problem.

Ant
  • 134
  • 4