-5

I have a large csv file in the format e.g.

POBOX,DEPT,ORGNAME,SUBBNAME,BUILDING,BNUM,STREET,LOCALITY,TOWN,COUNTY,POSTCODE,ID
,,,,,20,A STREET,,EXETER,DEVON,EX1 2SP,10013036001
,,,FLAT 0,,100,ALPHO ROAD,,EXETER,DEVON,EX2 8JD,10013036004
,,,JANB,,100,ALPHO ROAD,,EXETER,DEVON,EX2 8HZ,10013036003
,,,,,2A,BLACKHALL ROAD,,EXETER,DEVON,EX4 4HE,
,,,,,300,FOR STREET,HEAVITREE,EXETER,DEVON,EX1 2QN,

I have manually sorted this file by the column ID (the last column) in Excel and saved it. I would like to write a python method to open the file and sort it by the column id and then write it back to the file, save it. Is this possible?

Riaz

Riaz Ladhani
  • 3,946
  • 15
  • 70
  • 127
  • How're you reading in the data? If you're reading it in as a two dimensional array: https://stackoverflow.com/questions/20099669/python-sort-multidimensional-array-based-on-2nd-element-of-subarray – Stiffo Aug 06 '15 at 12:50
  • The file is being read from a method, my colleague has written some code to compare the data between 2 files. The method is def diff_reports(): c = FileComparator([r"C:\QA\output\Excel sorted\file11.csv", r"C:\QA\output\Excel sorted\file12.csv"], 11, r"C:\QA\gazdb ABP output\Excel sorted\errs_5.csv") c.compare(0) – Riaz Ladhani Aug 06 '15 at 12:54
  • I would like to have the file sorted by ID column before i call the method which compares the data – Riaz Ladhani Aug 06 '15 at 12:54
  • 1
    Yes, it's possible. What have you tried? – martineau Aug 06 '15 at 12:58
  • I am reading up on this to try and do it https://wiki.python.org/moin/HowTo/Sorting – Riaz Ladhani Aug 06 '15 at 13:01

1 Answers1

2
import csv
array = csv.reader(open('1.csv','r'))
a = list(array)[:] # for a[0] - save first line
s_array = [a[0]] + sorted(a[1:], key=lambda x: x[-1])
csv.writer(open('2.csv', 'wb'), dialect='excel').writerows(s_array)

special thanks to:

Python tuples sorting based on last element

How to write a tuple of tuples to a CSV file using Python

Converting a csv file into a list of tuples with python

Community
  • 1
  • 1
Alexey Astahov
  • 203
  • 1
  • 7
  • Thanks for the suggestion. It is sorting the files but the records that don't have a value in the ID column are appearing first. I would like the records that do have an ID value to appear first and in sorted order – Riaz Ladhani Aug 06 '15 at 13:44
  • E.g. record without an id value: ,,,,,201,BLACKHALL ROAD,,EXETER,DEVON,EX4 4HE, – Riaz Ladhani Aug 06 '15 at 13:44
  • E.g. record with an id value: ,,,,,28,ALPHA STREET,,EXETER,DEVON,EX1 2SP,10013036001 – Riaz Ladhani Aug 06 '15 at 13:44
  • I think I have got it. I changed key=lambda x: x[-1]) to key=lambda x: x[1]) – Riaz Ladhani Aug 06 '15 at 13:47
  • I got excited to quick. It hasn't sorted it for the rows which has an ID value to appear at the top. The records with a blank ID are appearing at the top. I would like it for the records that do have an ID value to appear at the top – Riaz Ladhani Aug 06 '15 at 14:06
  • try this hack: s_array = [a[0]] + sorted(a[1:], key=lambda x: x[-1] if x[-1] else '~'). Because '~' is penultimate symbol in ascii table – Alexey Astahov Aug 06 '15 at 15:01