I am trying to read a CSV file into a list and then sort it by its first two columns (by the first column, then by the second column when the first values are equal). This is what I am doing:

def sortcsvfiles(inputfilename,outputfilename):
    list1=[]
    row1=[]
    with open(inputfilename,'rt') as csvfile1:
        reader=csv.reader(csvfile1)
        cnt=0       
        for row in reader:
            if cnt==0:        #skip first row as it contains header information
                row1=row
                cnt+=1
                continue    
            list1.append((row)) 

        list1.sort(key=lambda ro: (int(ro[0]),int(ro[1])))

    list1.insert(0, row1)
    with open(outputfilename,'wt') as csvfile1:
        writer=csv.writer(csvfile1, lineterminator='\n')
        for row in list1:
            writer.writerow(row)

But I am getting the following error:

  File "C:\Users\50004182\Documents\temp.py", line 37, in <lambda>
    list1.sort(key=lambda ro: (int(ro[0]),int(ro[1])))
IndexError: list index out of range

How can I fix this?

Martijn Pieters
Noober

2 Answers

You probably have an empty line in your file, perhaps the last one. One fix is to skip empty rows:

import csv

def sortcsvfiles(inputfilename,outputfilename):
    with open(inputfilename,'rt') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)
        data = [row for row in reader if row] # ignore empty lines
        data.sort(key=lambda ro: (int(ro[0]),int(ro[1])))

    with open(outputfilename,'wt') as csvfile:
        writer=csv.writer(csvfile, lineterminator='\n')
        writer.writerow(header)
        writer.writerows(data)
Daniel

The error occurs because you have at least one row that does not have 2 columns. It may have 1 or even 0 instead.
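A quick way to see this is to feed csv.reader a small in-memory sample. This is only a sketch: the sample data is made up for illustration and is not the asker's file. A blank line comes back as an empty list, so indexing it with [0] or [1] raises IndexError:

```python
import csv
import io

# Hypothetical sample: a header, two data rows, and a blank line in between.
sample = "a,b\n2,1\n\n1,5\n"

# csv.reader accepts any iterable of lines, so io.StringIO stands in for a file.
rows = list(csv.reader(io.StringIO(sample)))
print(rows)  # the blank line appears as an empty list among the rows

# Rows that would break a sort key using row[0] and row[1].
short_rows = [row for row in rows if len(row) < 2]
print(short_rows)
```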

You could test for this before appending the row:

if len(row) > 1:
    list1.append(row) 

To sort all rows but leave the header row out of the sort, you can read the header first with the next() function (see a previous answer of mine), and then sort the remaining rows, perhaps with the sorted() function:

import csv

def sortcsvfiles(inputfilename, outputfilename):
    with open(inputfilename,'rt') as csvfile1:
        reader = csv.reader(csvfile1)
        headers = next(reader, None)  # get one row, or None if there are no rows
        rows = sorted(
            (r for r in reader if len(r) > 1),
            key=lambda r: (int(r[0]), int(r[1])))

    with open(outputfilename,'wt') as csvfile1:
        writer = csv.writer(csvfile1, lineterminator='\n')
        if headers:
            writer.writerow(headers)
        writer.writerows(rows)

I used writer.writerows() to write the whole list of sorted rows in one call.
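If you are not sure which rows are too short, you could list them before sorting. The helper below is a hypothetical sketch (the name find_short_rows and the min_columns parameter are my own, not part of the csv module); it relies on reader.line_num, which the csv module does provide, to report where each short row was read:

```python
import csv
import io

def find_short_rows(csvfile, min_columns=2):
    """Return (line_number, row) pairs for data rows with too few columns.

    csvfile is any iterable of lines, such as an open file object.
    """
    reader = csv.reader(csvfile)
    next(reader, None)  # skip the header row, if there is one
    # reader.line_num is the number of source lines read so far,
    # so it points at the line the current row came from.
    return [(reader.line_num, row) for row in reader if len(row) < min_columns]

# Made-up sample data: line 3 has one column and line 4 is blank.
sample = "id,sub\n3,1\n7\n\n1,2\n"
for line_number, row in find_short_rows(io.StringIO(sample)):
    print(line_number, row)
```

With a real file you would pass the open file object instead of the io.StringIO wrapper.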

Martijn Pieters
  • I don't think there are any rows with length less than `2`. I tried your solution but still it gives the same error. – Noober Dec 25 '15 at 12:43
  • @Noober: that's the only reason the exception occurs; if `r[0]` or `r[1]` doesn't exist. Note that the traceback shows the exception occurs within the lambda. – Martijn Pieters Dec 25 '15 at 12:44
  • @Noober: please share the traceback (in a [pastie](http://pastie.com/)) of the new exception with my change applied. I *strongly doubt* that my code will give the exact same exception. – Martijn Pieters Dec 25 '15 at 12:45
  • Both the errors are exactly the same. Here is the pastie link-http://pastie.org/private/xjufbu45whc3nl95xzwaw – Noober Dec 25 '15 at 12:49
  • @Noober: that's not quite my code. You called `sorted` on `list1`. I called `sorted()` on a generator expression that filters out any rows shorter than 2 elements. Can you share the code (in a pastie again) that throws that exception? – Martijn Pieters Dec 25 '15 at 12:51
  • Sorry my bad, I was running your updated code in another version of my code. The error was as you said with some of the rows becoming empty – Noober Dec 25 '15 at 12:54