The other two answers work, but neither is very elegant or in the spirit of the pandas library. Instead, consider this one-liner, which in my tests was also very slightly faster than Kyle Barron's vectorized answer. It requires no helper functions, is vectorized, and stays within the pandas ecosystem:
import pandas as pd
dtseries.dt.to_period('M').dt.to_timestamp()
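To see what the one-liner does, here is a minimal sketch. The sample timestamps are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the flooring.
dtseries = pd.Series(
    pd.to_datetime(["2021-03-17 14:05", "2021-03-31 23:59", "2021-04-01 00:00"])
)

# Convert to monthly periods, then back to timestamps at the period start.
month_start = dtseries.dt.to_period("M").dt.to_timestamp()
print(month_start)
```

Each value is mapped to midnight on the first day of its month, and the result is again a datetime series.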
This method has the added benefit of supporting many other frequencies to floor to, such as weekly ('W') or business days ('B'), that would be trickier to implement with the vectorized approach above.
You can find the abbreviations for various other frequencies in the relevant doc page.
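As a rough illustration of swapping the frequency, the sketch below floors the same kind of series to the week and to the quarter. The dates are hypothetical. Note that 'W' is anchored on Sunday by default, so each weekly period starts on a Monday:

```python
import pandas as pd

# Hypothetical sample data: a Wednesday and a Sunday.
dtseries = pd.Series(pd.to_datetime(["2021-03-17", "2021-05-09"]))

# Floor to the start of the week (Monday, with the default 'W' anchoring).
week_start = dtseries.dt.to_period("W").dt.to_timestamp()

# Floor to the start of the calendar quarter.
quarter_start = dtseries.dt.to_period("Q").dt.to_timestamp()

print(week_start)
print(quarter_start)
```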
This of course assumes that dtseries is a datetime series; if it is not, you can easily convert it with pd.to_datetime(my_series).
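For example, if the dates arrive as strings, the conversion might look like this. The contents of my_series are an assumption for illustration:

```python
import pandas as pd

# Hypothetical input: dates stored as plain strings (object dtype).
my_series = pd.Series(["2021-03-17", "2021-04-02"])

# Convert to a datetime series so that the .dt accessor becomes available.
dtseries = pd.to_datetime(my_series)
is_datetime = pd.api.types.is_datetime64_any_dtype(dtseries)

month_start = dtseries.dt.to_period("M").dt.to_timestamp()
print(is_datetime)
print(month_start)
```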
This solution also allows for great flexibility in using various offsets. For example, to use the tenth day of the month:
from pandas.tseries.offsets import DateOffset
dtseries.dt.to_period('M').dt.to_timestamp() + DateOffset(days=9)  # month start is day 1, so +9 days gives the 10th
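Since to_timestamp() returns the first day of the month, adding k days lands on day k + 1, so reaching the tenth day takes an offset of nine days. A quick check on made-up dates:

```python
import pandas as pd
from pandas.tseries.offsets import DateOffset

# Hypothetical sample data.
dtseries = pd.Series(pd.to_datetime(["2021-03-17", "2021-04-02"]))

month_start = dtseries.dt.to_period("M").dt.to_timestamp()

# Day 1 plus 9 days is day 10.
tenth = month_start + DateOffset(days=9)
print(tenth)
```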
I recommend checking the documentation for pandas offsets. pandas provides many fairly complex offsets, such as business days, holidays, and business hours. Those would be extremely cumbersome to implement by hand as proposed in the answers by @KyleBarron and @JonClements. For instance, to get dates offset 5 business days from the start of the month:
from pandas.tseries.offsets import BusinessDay
dtseries.dt.to_period('M').dt.to_timestamp() + BusinessDay(n=5)
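Here is a small sketch of the business-day case, using hypothetical dates whose months begin on a weekday. Depending on the pandas version, adding this offset to a whole series may emit a PerformanceWarning, but the result is the same:

```python
import pandas as pd
from pandas.tseries.offsets import BusinessDay

# Hypothetical sample data: March 2021 starts on a Monday, June 2021 on a Tuesday.
dtseries = pd.Series(pd.to_datetime(["2021-03-17", "2021-06-20"]))

month_start = dtseries.dt.to_period("M").dt.to_timestamp()

# Move forward five business days, skipping the weekend in between.
five_bdays_in = month_start + BusinessDay(n=5)
print(five_bdays_in)
```

Both results fall one calendar week after the month start, because five business days span exactly one weekend.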