0

input:

data["Date"] = ["2005-01-01", "2005-01-02" , ""2005-01-03" ,..., "2014-12-30","2014-12-31"]

how can i sort the column such that the column gives 1st date of every year, 2nd date of every and so on:

i.e.

output:

data["Date"] = ["2005-01-01","2006-01-01","2007-01-01", ... "2013-12-31","2014-12-31"]

NOTE: assuming the date column has no leap days

Ardein_
  • 166
  • 6

5 Answers5

0

Try using lambda expressions.

from datetime import datetime

data = {"Date": ["2005-01-02", "2005-01-01", "2014-12-30", "2014-12-31"]}
data["Date"].sort(key=lambda date: datetime.strptime(date, "%Y-%m-%d"))
Ardein_
  • 166
  • 6
0
>>> import datetime
>>> dates = [datetime.datetime.strptime(ts, "%Y-%m-%d") for ts in data["Date"]]
>>> dates.sort()
>>> sorteddates = [datetime.datetime.strftime(ts, "%Y-%m-%d") for ts in dates]
>>> sorteddates
['2010-01-12', '2010-01-14', '2010-02-07', '2010-02-11', '2010-11-16', '2010-11-
22', '2010-11-23', '2010-11-26', '2010-12-02', '2010-12-13', '2011-02-04', '2011
-06-02', '2011-08-05', '2011-11-30']
Rajnish kumar
  • 186
  • 1
  • 14
0

First:

data['D'] = data['Date'].apply(lambda x : datetime.datetime.strptime(x, '%Y-%m-%d'))
data['Day'] = data['D'].apply(lambda x: x.day)
data['Month'] = data['D'].apply(lambda x: x.month)
data['Year'] = data['D'].apply(lambda x: x.year)
data.drop(columns='D', inplace=True)

Then, having 4 columns dataframe, we sort as following:

data.sort_values(by=['Day','Month','Year'], inplace=True)

Finally, you can drop new columns if you won't need them:

data.drop(columns = ['Day','Month','Year'], inplace=True)
0

Why dont't you try and create a new column in which you change the format of the date? Like this :

def change_format(row):
    date_parts = row.split('-')
    new_date = date_parts(2)+"-"+date_parts(1)+"-"+date_parts(0)
    return new_date

data["Date_new_format"] = data["Date"].apply(lambda row => change_format(row))

Now you can sort your dataframe according to the column Date_new_format and you will get what you need.

Catalina Chircu
  • 1,506
  • 2
  • 8
  • 19
0

Use:

data["temp"] = pd.to_datetime(data["Date"]).dt.strftime("%d-%Y-%m")
data = data.sort_values(by="temp").drop(columns=["temp"])
Shubham Sharma
  • 68,127
  • 6
  • 24
  • 53