This is a follow-up to a this question, which was asked a few years ago. The output I'm getting makes me think date offsets in pandas have changed during that time.
I have dates, and I want to move them to the end of the month that they're in. For example:
df = pd.DataFrame([pd.Timestamp('2014-01-15'), pd.Timestamp('2014-01-31')],
columns=['orig'])
I want to transform both of these to 2014-01-31
. I'm wondering the pandamic way to do this.
The accepted answer in the other question offers two solutions:
1) pd.Index(df.orig).to_period('M').to_timestamp('M')
2) pd.Index(df.orig) + pd.offsets.MonthEnd(0)
However, in pandas 0.17.1 these do not both give the same answer. The first approach works, but the second does something different:
df[0] = df['orig'] + pd.offsets.MonthEnd(0)
df[1] = df['orig'] + pd.offsets.MonthEnd(1)
orig 0 1
0 2014-01-15 2013-12-31 2014-01-31
1 2014-01-31 2013-12-31 2014-02-28
So MonthEnd(0)
moves all dates to the end of the previous month, while MonthEnd(1)
move dates to the end of the current month except if it's already the last day of the month, in which case it moves it to the end of this next month. This seems quite strange.
So, is (1) really the best way to do this? I'm surprised that there's not simple way to accomplish the same thing using offsets.