3

I'm using Python 2.7, PyCharm and Anaconda.

I have a list of dates and, for each month present in the list, I'd like to retrieve the latest date that appears in it.

Are there any functions or libraries that could help me to do this?

I read the dates from a CSV file and stored them as datetime.

I have the following code:

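import datetime
import xlrd

# `dates` holds Excel serial date numbers; startDate and endDate are
# datetime bounds defined elsewhere.
Dates1 = []
for date in dates:
    temp = xlrd.xldate_as_tuple(int(date), 0)  # (year, month, day, hour, minute, second)
    Dates1.append(datetime.datetime(temp[0], temp[1], temp[2]))

# Keep only the dates that fall within [startDate, endDate].
Dates = [date for date in Dates1 if startDate <= date <= endDate]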

To make it clear, suppose I have:

Dates = [2015-01-20, 2015-01-15, 2015-01-17, 2015-02-21, 2015-02-06] 

(Assume these are datetime objects.)

The list I'd like to retrieve is:

[2015-01-20, 2015-02-21]

So far I've searched around, especially on Stack Overflow, but I could only find answers on how to get the last calendar date of each month, not the last date within a user-specified list.

Michael Laszlo
Pedro Braz
  • 2
    Can you show us the code you wrote that produces these dates? What have you tried so far? – rayryeng Sep 16 '15 at 14:58
  • Find the actual last date of the month and subtract each date from that date. The smallest difference is the closest. – Bob Dylan Sep 16 '15 at 15:16
  • Please see my answer. If that works for you, could you please accept it and upvote it? – Michael Laszlo Sep 16 '15 at 16:12
  • I think the question is not completely clear. Do you want the last valid date for a particular month (so if you have [2015-01-20, 2015-01-15, 2015-01-17] then you want 2015-01-31)? Or do you want the latest *among your collection* (so with those three dates, you want 2015-01-20)? If you want 2015-01-31, then your question is a duplicate of [this one](http://stackoverflow.com/questions/42950/get-last-day-of-the-month-in-python). – John Y Sep 16 '15 at 17:28
  • I want the chronologically last one. Sure, @Michael Laszlo! Let me just test it and I'll be sure to accept it. Thanks all! – Pedro Braz Sep 16 '15 at 17:32
  • 1
    You still didn't really answer my question, but it sounds like you want 2015-01-20, if given the list of three dates in my previous comment. Is that correct? Most people (including both of the answers below) have interpreted your question to mean you want 2015-01-31. – John Y Sep 16 '15 at 17:59
  • Yes, I would like 2015-01-20. Thanks, I'll edit the question to make it clearer! – Pedro Braz Sep 16 '15 at 18:03
  • 1
    Sorry, I should have said that only Michael Laszlo's answer had the mistaken interpretation; leroyJr seems to have read your question the way you wanted. – John Y Sep 16 '15 at 18:06

3 Answers

4

For year y and month m, calendar.monthrange(y, m)[1] returns the day number of the last day of the month.
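As a quick illustration of that call, here is a minimal sketch. The variable names are only illustrative; `calendar.monthrange` returns a tuple of (weekday of the first day, number of days in the month), which is why index 1 is used.

```python
import calendar

# monthrange returns (weekday of the first day, number of days in the month).
first_weekday, days_in_feb_2015 = calendar.monthrange(2015, 2)

# Index 1 alone gives the last day number; 2016 is a leap year.
days_in_feb_2016 = calendar.monthrange(2016, 2)[1]

print(days_in_feb_2015)  # 28
print(days_in_feb_2016)  # 29
```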

The following script takes a list of datetime objects called dates and makes a new list, month_last_dates, containing datetime objects corresponding to the last calendar date of each month in which the members of dates fall.

import datetime
import calendar

tuples = [(2015, 8, 1), (2015, 9, 16), (2015, 10, 4)]
dates = [datetime.datetime(y, m, d) for y, m, d in tuples]

month_last_dates = len(dates) * [None]
for i, date in enumerate(dates):
  y, m, d = date.year, date.month, date.day
  last = calendar.monthrange(y, m)[1]
  print y, m, last  # Output for testing purposes.
  month_last_dates[i] = datetime.datetime(y, m, last)

Here is an equivalent script written more concisely with the help of a list comprehension:

import datetime
import calendar

tuples = [(2015, 8, 1), (2015, 9, 16), (2015, 10, 4)]
dates = [datetime.datetime(y, m, d) for y, m, d in tuples]

month_last_dates = [datetime.datetime(date.year, date.month,
      calendar.monthrange(date.year, date.month)[1]) for date in dates]

# Output for testing purposes.
for date in month_last_dates:
  print date.year, date.month, date.day

In your case, given the list Dates, you can make a new list like this:

last_dates = [datetime.datetime(date.year, date.month,
      calendar.monthrange(date.year, date.month)[1]) for date in Dates]
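
The code above returns calendar month-ends. If what is wanted is instead the latest date that actually appears in the list for each month (as the asker clarified in the comments), one possible standard-library approach is to track the maximum per (year, month) key. This is only a sketch: the helper name `last_date_per_month` is hypothetical, and the sample data is taken from the question.

```python
import datetime


def last_date_per_month(dates):
    """Return the latest date present in `dates` for each (year, month),
    ordered chronologically. Hypothetical helper for illustration."""
    latest = {}
    for d in dates:
        key = (d.year, d.month)
        # Keep the largest date seen so far for this year/month pair.
        if key not in latest or d > latest[key]:
            latest[key] = d
    return [latest[key] for key in sorted(latest)]


Dates = [
    datetime.datetime(2015, 1, 20),
    datetime.datetime(2015, 1, 15),
    datetime.datetime(2015, 1, 17),
    datetime.datetime(2015, 2, 21),
    datetime.datetime(2015, 2, 6),
]

result = last_date_per_month(Dates)
# result holds 2015-01-20 and 2015-02-21, in that order.
```

Keying on (year, month) rather than month alone keeps January 2015 and January 2016 separate when the list spans several years.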
Michael Laszlo
  • That is a good answer, but the resulting list has the same size as the original. Ideally I'd like a list with only the unique dates. Any ideas? Thanks! – Pedro Braz Sep 16 '15 at 17:50
  • 1
    Sure, you can do that to `last_dates`. Write: `last_dates = list(set(last_dates))` after the code I gave you above. – Michael Laszlo Sep 16 '15 at 17:54
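
One caveat on the deduplication suggested in the comment above: a set does not preserve order, so `list(set(...))` may return the dates in arbitrary order. A small sketch, assuming chronological order is wanted, is to sort the set instead. The sample `Dates` list and the name `unique_last_dates` are illustrative only.

```python
import calendar
import datetime

Dates = [
    datetime.datetime(2015, 1, 20),
    datetime.datetime(2015, 1, 15),
    datetime.datetime(2015, 2, 21),
]

# Calendar month-end for every input date (contains duplicates).
last_dates = [datetime.datetime(d.year, d.month,
                                calendar.monthrange(d.year, d.month)[1])
              for d in Dates]

# Deduplicate, then sort to restore chronological order.
unique_last_dates = sorted(set(last_dates))
# unique_last_dates holds 2015-01-31 and 2015-02-28, in that order.
```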
4

Pandas can handle this task well. Load your CSV into a DataFrame, group the rows by month, and take the maximum date in each group with an aggregate function:

import pandas as pd
import numpy as np

# Load the file, parsing the date column as datetimes (required for time-based grouping)
df = pd.read_csv('/path/to/file/', parse_dates=['my_date_field'])
df.index = df['my_date_field']              # use the dates as the DataFrame index
# Group by calendar month ('MS' would label groups by month start instead).
# pd.Grouper replaces pd.TimeGrouper, which was removed from later pandas versions.
dfg = df.groupby(pd.Grouper(freq='M'))

# Finally, find the max date in each month
dfg.agg({'my_date_field': np.max})

# To specifically coerce the results of the groupby to a list:
dfg.agg({'my_date_field': np.max})['my_date_field'].tolist()
leroyJr
  • Still, what I get is a `DataFrameGroupBy` object, and I can't seem to access it like a normal list/array, e.g. dfg[0]. – Pedro Braz Sep 16 '15 at 17:44
  • Just one more step to change from the DataFrameGroupBy to a list: `dfg.agg({'a_date': np.max})['a_date'].tolist() # Select the column and coerce` – leroyJr Sep 16 '15 at 18:09
  • 1
    If there's only a single column selected in your dataframe projection you issue a simpler aggregate command (just the function): `Dates.groupby(pd.TimeGrouper(freq='M')).agg(np.max).tolist()` – leroyJr Sep 16 '15 at 18:22
0

This can be done in one line with pandas. Note that it groups on the month number alone, so it assumes all dates fall within a single year:

import datetime
import pandas as pd

dates = [datetime.date(2023, 3, 21), datetime.date(2023, 3, 1), datetime.date(2023, 2, 22), datetime.date(2023, 2, 14)]
eom_dates = set(pd.Series(dates,index=dates).groupby(lambda x:x.month).max())

This results in:

{datetime.date(2023, 2, 22), datetime.date(2023, 3, 21)}
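
If the dates can span more than one year, grouping on the month number alone would merge, say, March 2022 with March 2023. A hedged variant that groups on both year and month might look like the sketch below. It assumes pandas is installed; the extra 2022 date is added purely for illustration.

```python
import datetime
import pandas as pd

dates = [
    datetime.date(2022, 3, 5),   # illustrative extra date from another year
    datetime.date(2023, 3, 21),
    datetime.date(2023, 3, 1),
    datetime.date(2023, 2, 22),
    datetime.date(2023, 2, 14),
]

# Convert to a datetime64 Series so the .dt accessor is available.
s = pd.Series(pd.to_datetime(dates))

# Group on (year, month) and take the latest date in each group.
latest = s.groupby([s.dt.year, s.dt.month]).max()

# Convert the resulting Timestamps back to plain datetime.date objects.
eom_dates = [ts.date() for ts in latest]
# eom_dates holds 2022-03-05, 2023-02-22 and 2023-03-21, in that order.
```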