I have some timestamps, and am trying to find the timestamp at the start of the month in which they occur. I found a solution, but it's convoluted and points to unintuitive usage of the methods used. I wonder if there is a better solution.
Here are 3 samples, which should all have the same month start: Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin')
.
import pandas as pd
stamps = [
pd.Timestamp("2020-01-01 00:00", tz="Europe/Berlin"),
pd.Timestamp("2020-01-01 03:00", tz="Europe/Berlin"),
pd.Timestamp("2020-01-27 18:00", tz="Europe/Berlin"),
]
First attempt:
In: [ts + pd.offsets.MonthBegin(0) for ts in stamps]
Out: [Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-01-01 03:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-02-01 18:00:00+0100', tz='Europe/Berlin')]
Before addressing anything else: for me, the month starts at midnight, so let's first get rid of the time:
In: [ts.floor('D') + pd.offsets.MonthBegin(0) for ts in stamps]
Out: [Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-02-01 00:00:00+0100', tz='Europe/Berlin')]
Midnight, good, but I don't know why it returns Feb as the month begin of last timestamp. (This behaviour, with the final timestamp being assigned a different month than the other 2, exists for all arguments <= 0.)
Third attempt:
In: [ts.floor('D') + pd.offsets.MonthBegin(1) for ts in stamps]
Out: [Timestamp('2020-02-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-02-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-02-01 00:00:00+0100', tz='Europe/Berlin')]
Still wrong, but at least they all have the same timestamp. Have to subtract one month:
In: [ts.floor('D') + pd.offsets.MonthBegin(1) + pd.offsets.MonthBegin(-1) for ts in stamps]
Out: [Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin'),
Timestamp('2020-01-01 00:00:00+0100', tz='Europe/Berlin')]
This is what I need. But as I wrote, it seems overly convoluted. My questions:
a) Is there a better/shorter way?
b) Is this the intended behaviour of the MonthBegin
attribute? Or could it be a bug?
Many thanks