1

I have been reading several posts (1, 2, 3) about this but I can't make it work yet. I have a (simplified) CSV file like this:

NOMBRE,APELLIDO,ID,NACIMIENTO,FECHAINGRESO,MAILPERSONAL,DEPARTAMENTO
name1,lastname1,123,2000-01-01,2021-03-13,mymail1@example-com,IT
name2,lastname2,456,1999-01-01,2020-01-21,mymail2@example-com,IT

I want to sort it according to the header FECHAINGRESO, oldest date first, but don't how to do this. I have tried this with python 3.8.5 in Ubuntu 20:

import csv
import os
from datetime import datetime
# With this I read the cvs and print it to check if everything is ok
with open('Empleados.csv', newline='') as csvfile:
   spamreader = csv.reader(csvfile, delimiter=',')
   for row in spamreader:
       print(', '.join(row))
# The next is the code from several attempts where I failed to sort the cvs
with open('Empleados.csv', newline='') as csvfile:
    # I wrote 4 because I belive the position 4 in the headers' row is the one with FECHAINGRESO
    csvfile = sorted(csvfile, key = lambda row: datetime.strptime(row[4], "%d-%m-%Y"))
    print(csvfile)
    s = sorted(csvfile, key=lambda x:datetime.strptime(x[4],"%d-%m-%Y"), reverse=True)
    print(s)
    l = sorted(csvfile, key=lambda x: x[4], reverse=True)
    print(l)
    sortedlist = sorted(csvfile, key=operator.itemgetter(4), reverse=False)
    print(sortedlist)
    sortedlist = sorted(csvfile, key=lambda row: row[4], reverse=True)
    print(sortedlist)

Basically none of them worked because it reads the row like a string and usually returns this error:

  File "/home/Pruebas VSC/prueba_postgresql.py", line 31, in <module>
    csvfile = sorted(csvfile, key = lambda row: datetime.strptime(row[4], "%d-%m-%Y"))
  File "/home/Pruebas VSC/prueba_postgresql.py", line 31, in <lambda>
    csvfile = sorted(csvfile, key = lambda row: datetime.strptime(row[4], "%d-%m-%Y"))
  File "/usr/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data 'E' does not match format '%d-%m-%Y'

Some of them don't fail, but they don't sort by the date in that column the CSV.

Everything in the last with-open are codes from other questions I found in google, but I don't understand everything. Hope someone can help me understand how to sort this list. I want to save the CSV file later, but I believe that should be easy with write after it is sorted.

martineau
  • 119,623
  • 25
  • 170
  • 301
Alex Turner
  • 698
  • 6
  • 16

2 Answers2

2

As @barny pointed out, you need to have a list. Also as @DeepSpace pointed out you can perform the sorted operation on the iterable.

A couple other issues. The sorting only works without the header row (which doesn't contain a date). So I removed it to test the following. Also I had to tweak your date format string to be "%Y-%m-%d" to match your data.

So altogether:

with open('Empleados.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    spamreader = sorted(spamreader, key=lambda x:datetime.strptime(x[4],"%Y-%m-%d"), reverse=True)

Edit: If you want to handle the header cleanly as well (without trying it yourself for learning exercise), use the following.

with open('Empleados.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    header = next(spamreader)
    spamreader = sorted(spamreader, key=lambda x:datetime.strptime(x[4],"%Y-%m-%d"), reverse=True)
    spamreader.insert(0, header)
William
  • 381
  • 1
  • 8
  • Did you try it? it does not work since it tries to convert the string `'FECHAINGRESO'` to a date – DeepSpace Jun 23 '21 at 19:58
  • Yes I tried it. Didn't you see in my post where I said I had to remove the header row? – William Jun 23 '21 at 19:58
  • ,,, which does not help the OP since their file is a csv file with a header... this should be handled by code, not by modifying the input file in a way that makes it useless – DeepSpace Jun 23 '21 at 19:59
  • Sure it helps. They can figure out how to fill in the gaps. Are we supposed to do everything for them? It's a learning website. – William Jun 23 '21 at 20:00
  • Both codes worked fine. I prefer the second one because you don't have to change the data/csv. Thanks – Alex Turner Jun 23 '21 at 20:43
0

If you're open to external libraries, the best way to do this is to use the power of pandas. This is what pandas was made for.

import pandas as pd
df = pd.read_csv('Empleados.csv')
df = df.sort_values('FECHAINGRESO', ascending=False)
df

# to save the sorted csv
df.to_csv('Empleados_sorted.csv')
William
  • 381
  • 1
  • 8