I have a time series that does not contain every calendar date (i.e. it only has trading dates). The series can be reproduced with:
import numpy as np
import pandas as pd
dates = pd.Series(np.random.randint(100, size=30), index=pd.to_datetime(['2010-01-04', '2010-01-05', '2010-01-06', '2010-01-07',
'2010-01-08', '2010-01-11', '2010-01-12', '2010-01-13',
'2010-01-14', '2010-01-15', '2010-01-19', '2010-01-20',
'2010-01-21', '2010-01-22', '2010-01-25', '2010-01-26',
'2010-01-27', '2010-01-28', '2010-01-29', '2010-02-01',
'2010-02-02', '2010-02-03', '2010-02-04', '2010-02-05',
'2010-02-08', '2010-02-09', '2010-02-10', '2010-02-11',
'2010-02-12', '2010-02-16']))
I would like to get the last day of each month that is actually present in my dates, i.e. '2010-01-29' and '2010-02-16'.
I have looked at "Get the last date of each month in a list of dates in Python",
and more specifically at this snippet:
import pandas as pd
import numpy as np
df = pd.read_csv('/path/to/file/') # Load a dataframe with your file
df.index = pd.to_datetime(df['my_date_field']) # set a DatetimeIndex from your date column (time-based grouping needs one)
dfg = df.groupby(pd.Grouper(freq='M')) # group by month end (alternatively use 'MS' for month start); pd.Grouper replaces pd.TimeGrouper, which was removed in pandas 1.0
# Finally, find the max date in each month
dfg.agg({'my_date_field': np.max})
# To specifically coerce the results of the groupby to a list:
dfg.agg({'my_date_field': np.max})['my_date_field'].tolist()
... but I can't quite figure out how to adapt this to my case, where the dates are the index of a Series rather than a column of a DataFrame. Thanks in advance.
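For reference, here is a minimal sketch of one way the snippet above might be adapted to a Series with a DatetimeIndex. It is not taken from the linked answer. The names `s`, `last_days` and `month_end_values` are made up for illustration, and the sample uses only a handful of the dates above with placeholder values instead of random ones. It groups the index by calendar month with `to_period('M')` and takes the maximum date in each group.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical sample: a few of the trading dates from the question
idx = pd.to_datetime(['2010-01-04', '2010-01-28', '2010-01-29',
                      '2010-02-01', '2010-02-12', '2010-02-16'])
s = pd.Series(np.arange(len(idx)), index=idx)  # placeholder values 0..5

# Turn the index into a Series of dates, group it by calendar month,
# and keep the latest date that actually occurs in each month
last_days = s.index.to_series().groupby(s.index.to_period('M')).max()
print(last_days.dt.strftime('%Y-%m-%d').tolist())

# Optionally, select the rows of the original Series that fall on those dates
month_end_values = s[s.index.isin(last_days)]
print(month_end_values)
```

Because the grouping is done on the dates that exist in the index, the result is the last available trading date per month, not the calendar month end.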