1

In my program, Python opens a CSV Excel file and reads it. The file contains alphanumeric data, separated into three different cells in the Excel file. The program then makes the data a list.

The program should sort the list by the numerical data, highest to lowest. But I have been not able to find a solution for this.

So far I have:

def highlow ():
    file = open("Thisfile.csv", "r" )
    read = file.readlines ()
    list(read)

I've tried:

read.sort ()

But when I print 'read' all I get is:

['ANDERSON,Jane,7\n', 'BIRCH,Darren,9\n', 'MCKOLEEP,Philip,6\n', 'SMITH,Richard,3\n']

Instead I wanted the list sorted by the integer in each item, instead of alphabetically.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
W Racoon
  • 9
  • 4

3 Answers3

4

Start by reading the CSV with the csv module, which will give you properly separated values:

import csv

with open("Thisfile.csv", "rb") as file:
    reader = csv.reader(file)
    rows = [r[:-1] + [int(r[-1])] for r in reader]

rows.sort(key=lambda r: r[-1])

The rows = [...] line uses a list comprehension to convert each last column to an integer object as the CSV file is read.

With the rows as separate lists, sorting then becomes relatively trivial; just sort or that last column, which is what the key=lambda r: r[-1] bit does.

Demo:

>>> import csv
>>> demodata = '''\
... ANDERSON,Jane,7
... BIRCH,Darren,9
... MCKOLEEP,Philip,6
... SMITH,Richard,3
... '''
>>> reader = csv.reader(demodata.splitlines(True))
>>> rows = [r[:-1] + [int(r[-1])] for r in reader]
>>> rows.sort(key=lambda r: r[-1])
>>> for row in rows:
...     print row
... 
['SMITH', 'Richard', 3]
['MCKOLEEP', 'Philip', 6]
['ANDERSON', 'Jane', 7]
['BIRCH', 'Darren', 9]
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
1

You can use sth like that :

read = sorted(read, key=lambda data: data.replace('\n', '').split(',', 3)[2])
Yurrili
  • 338
  • 1
  • 11
1

You can try using operator.itemgetter(), which makes it very intuitive.

import csv
from operator import itemgetter

rows = []

with open("Thisfile.csv", "rb") as f:
    csv_reader = csv.reader(f1)
    for row in csv_reader:
        rows.append(row)

# This sorts by column #2 in descending order:
sorted_rows = sorted(rows, key=itemgetter(2), reverse=True)

>>>
[['BIRCH', 'Darren', '9'],
 ['ANDERSON', 'Jane', '7'],
 ['MCKOLEEP', 'Philip', '6'],
 ['SMITH', 'Richard', '3']]

You can also sort the list in place by changing the last line to:

rows.sort(key=itemgetter(2), reverse=True)

Additional: You also have the convenience to do a secondary sort if you have duplicate numeric values in the last column, say you can then sort by the last name after sorting by the numeric value. eg. if you have the following rows in the csv:

ANDERSON,Jane,7
BIRCH,Darren,9
MCKOLEEP,Philip,6
SMITH,Richard,3
WELSH,John,3

You can then use the following line for sorting:

sorted_rows = sorted(rows, key=itemgetter(2,0), reverse=True)

This sorts first by column #2, then additionally by column #0, all in descending order, giving the following output:

[['BIRCH', 'Darren', '9'],
 ['ANDERSON', 'Jane', '7'],
 ['MCKOLEEP', 'Philip', '6'],
 ['WELSH', 'John', '3'],
 ['SMITH', 'Richard', '3']]
vk1011
  • 7,011
  • 6
  • 26
  • 42