How to sort a file alphabetically by named column, python, csv

Question

I have three CSV files, each with three named columns: 'Genus', 'Species', and 'Source'. I merged the files into a new document and now need to sort the rows alphabetically, first by genus and then by species. I figured I could do this by sorting on species first and then on genus, which should leave everything in the proper order, but I haven't been able to find anything online that addresses how to sort named columns of strings. I tried lots of different ways of sorting, but each one either didn't change anything or replaced all the strings in the first column with the last string.

Here's my code for merging the files:

import csv, sys

with open('Footit_aphid_list_mod.csv', 'r') as inny:
    reader = csv.DictReader(inny)

    with open('Favret_aphid_list_mod.csv', 'r') as inny:
        reader1 = csv.DictReader(inny)

        with open ('output_al_vonDohlen.csv', 'r') as inny:
            reader2 = csv.DictReader(inny)

            with open('aphid_list_complete.csv', 'w') as outty:
                fieldnames = ['Genus', 'Species', 'Source']
                writer = csv.DictWriter(outty, fieldnames = fieldnames)
                writer.writeheader() 

                for record in reader:
                    writer.writerow(record)
                for record in reader1:
                    writer.writerow(record)
                for record in reader2:
                    writer.writerow(record)

                for record in reader:
                    g = record['Genus']
                    g = sorted(g)
                    writer.writerow(record)

inny.closed
outty.closed

first store all the data in a list of rows _then_ sort, _then_ write back to file. — Jean-François Fabre, Nov 21 '17 at 21:35
you may find this page useful: https://stackoverflow.com/questions/4233476/sort-a-list-by-multiple-attributes — pault, Nov 21 '17 at 21:39

Mark Tolonen · Accepted Answer · 2017-11-22T14:55:35.900

If your files aren't insanely large, read all the rows into a single list, sort it, then write it back out:

#!python2
import csv

rows = []

with open('Footit_aphid_list_mod.csv','rb') as inny:
    reader = csv.DictReader(inny)
    rows.extend(reader)

with open('Favret_aphid_list_mod.csv','rb') as inny:
    reader = csv.DictReader(inny)
    rows.extend(reader)

with open('output_al_vonDohlen.csv','rb') as inny:
    reader = csv.DictReader(inny)
    rows.extend(reader)

rows.sort(key=lambda d: (d['Genus'],d['Species']))

with open('aphid_list_complete.csv','wb') as outty:
    fieldnames = ['Genus','Species','Source']
    writer = csv.DictWriter(outty,fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
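
To illustrate why the tuple key gives the "genus first, then species" order, here is a minimal sketch on a few in-memory rows. The rows are made-up sample data, not taken from the original files. It also checks that the two-pass idea from the question (sort by species, then by genus) gives the same result, which works because Python's sort is stable.

```python
from operator import itemgetter

# Hypothetical sample rows standing in for the merged CSV data.
rows = [
    {'Genus': 'Myzus', 'Species': 'persicae', 'Source': 'a'},
    {'Genus': 'Aphis', 'Species': 'gossypii', 'Source': 'b'},
    {'Genus': 'Aphis', 'Species': 'fabae', 'Source': 'c'},
    {'Genus': 'Myzus', 'Species': 'cerasi', 'Source': 'a'},
]

# One pass: tuples compare element by element, so Genus decides first
# and Species only breaks ties between rows with the same Genus.
one_pass = sorted(rows, key=lambda d: (d['Genus'], d['Species']))

# Two passes: sort by the secondary key first, then by the primary key.
# The second sort keeps the Species order within each Genus because
# Python's sort is stable.
by_species = sorted(rows, key=itemgetter('Species'))
two_pass = sorted(by_species, key=itemgetter('Genus'))

for row in one_pass:
    print(row['Genus'], row['Species'])
```

Note that `sorted(g)` on a single string, as in the question's last loop, returns a list of that string's characters and leaves the rows themselves untouched; the sort has to be applied to the whole list of rows.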

edited Nov 22 '17 at 14:55

answered Nov 22 '17 at 07:38

This worked! The only thing is that because I'm using 2.7, I had to remove all the 'newline=' attributes from 'open'- but everything was just fine without them. – birdoptera Nov 22 '17 at 14:48
@birdoptera Updated. Note use of binary mode instead of `newline=''` for Python 2 per csv documentation. – Mark Tolonen Nov 22 '17 at 14:59
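
For readers on Python 3, a rough equivalent might look like the sketch below. It assumes text mode with `newline=''` in place of the binary mode used above, as the csv documentation recommends for Python 3. The helper name `merge_and_sort` and its parameters are hypothetical, introduced only for this illustration.

```python
import csv

def merge_and_sort(in_paths, out_path, fieldnames=('Genus', 'Species', 'Source')):
    """Read every input CSV, sort rows by Genus then Species, write one CSV.

    Hypothetical helper: assumes every input file has a header row
    containing exactly the given field names.
    """
    rows = []
    for path in in_paths:
        # newline='' lets the csv module handle line endings itself (Python 3).
        with open(path, newline='') as infile:
            rows.extend(csv.DictReader(infile))

    # Sort on a (Genus, Species) tuple: Genus first, Species breaks ties.
    rows.sort(key=lambda d: (d['Genus'], d['Species']))

    with open(out_path, 'w', newline='') as outfile:
        writer = csv.DictWriter(outfile, fieldnames=list(fieldnames))
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

With the file names from the question, the call would be `merge_and_sort(['Footit_aphid_list_mod.csv', 'Favret_aphid_list_mod.csv', 'output_al_vonDohlen.csv'], 'aphid_list_complete.csv')`.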
