3

I am attempting to merge several rows of csv data into one long row, given two cells contain the same data. For instance, take the following csv:

one, two, three
1, 2, 3
4, 5, 6
7, 8, 9
1, 1, 1
4, 4, 4

If two rows share the same value at row[0], I want the second row appended to the first. So my end product should look like this:

one, two, three
1, 2, 3, 1, 1, 1
4, 5, 6, 4, 4, 4
7, 8, 9

Here is my attempt so far:

import csv

uniqueNum = []
uniqueMaster = []
count = -1
with open("Test.csv", "rb") as source:
    reader = csv.reader(source)
    header = next(reader)
    for row in reader:
        if row[0] not in uniqueNum:
            uniqueMaster.append(row)
            uniqueNum.append(row[0])
            count = count + 1
            for row in reader:
                if row[0] in uniqueNum:
                    uniqueMaster[count].append(row)

with open("holding.csv","wb") as result:
    writer = csv.writer(result)
    writer.writerow(header)
    for row in uniqueMaster:
        writer.writerow(row) 

Things LOOK ok to me, but my script only outputs the following:

one, two, three
1, 2, 3, ['1', '1', '1']

This is obviously wrong for two reasons. First, it doesn't iterate through the entire csv, and second, the appended values are being squeezed into one cell, rather than individual cells. If anyone had any advice on getting this to work right I'd highly appreciate it!

bumble
  • 45
  • 1
  • 6

2 Answers2

4

Use a dictionary instead. Starting from the middle of your code(assume I have declared a dict called my_dict):

 for row in reader:
    if row[0] in my_dict.keys():
       my_dict[row[0]].extend(row)
    else:
       my_dict[row[0]]=row
  #...now we are at the bottom of your code, writing to the csv
 for v in my_dict.values():
    writer.writerow(v)
Albert Rothman
  • 998
  • 2
  • 9
  • 27
  • Thanks for the response. This nearly does it - the rows fall into the right spots, but they're not formatted properly. For example, the first key/value pairs are: `{'1': ['1', '2', '3', ['1', '1', '1']]}` when they should look like this: `{'1': ['1', '2', '3', '1', '1', '1']}`. Any advice on fixing this last part? – bumble Nov 16 '16 at 15:34
  • Fixed it. It should use extend() instead of append(). For more information see this [post](http://stackoverflow.com/questions/252703/append-vs-extend). Specifically Aaron Hall gives a great explanation with runtimes included. – Albert Rothman Nov 16 '16 at 18:12
  • That did it! Thanks for the crash course in dictionaries! – bumble Nov 16 '16 at 21:17
1
import csv

csv_dict = {}

with open("Test.csv", "r") as source:
    reader = csv.reader(source)
    header = next(reader)

    for row in reader:
        if row[0] in csv_dict:
            csv_dict[row[0]] += row
        else:
            csv_dict[row[0]] = row
Joshua N
  • 11
  • 2