The error in the OP occurred because datetime.datetime.strftime was called inside apply() without a datetime/date argument. Instead, pass format= as a separate keyword argument to apply(), which forwards it to strftime() as the format.
from datetime import datetime
x = dates.apply(datetime.strftime, format='%Y%m%d').astype(int)
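To make the one-liner above concrete, here is a minimal self-contained sketch. The two sample dates and the names as_text and as_int are made up for illustration; the only thing it relies on is that Series.apply forwards extra keyword arguments to the function it calls.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data: a Series of datetime64 values.
dates = pd.Series(pd.to_datetime(['2020-01-31', '2021-12-05']))

# Extra keyword arguments given to Series.apply are forwarded to the function,
# so each element is formatted as datetime.strftime(element, format='%Y%m%d').
as_text = dates.apply(datetime.strftime, format='%Y%m%d')

# The resulting strings contain only digits, so they convert cleanly to int.
as_int = as_text.astype(int)
print(as_int.tolist())  # [20200131, 20211205]
```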
If the dates were strings in YYYY-MM-DD form (instead of datetime/date objects), then str.replace() should do the job.
x = dates.str.replace('-', '').astype(int)
# using apply
x = dates.apply(lambda x: x.replace('-', '')).astype(int)
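A small runnable version of the string case, assuming the strings look like '2020-01-31' (the sample values and variable names are made up). regex=False is passed explicitly so the '-' is treated as a literal character regardless of the pandas version's default.

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain strings.
date_strings = pd.Series(['2020-01-31', '2021-12-05'])

# Vectorized string accessor: strip the dashes, then convert to int.
via_str = date_strings.str.replace('-', '', regex=False).astype(int)

# Same result using Python's own str.replace on each element.
via_apply = date_strings.apply(lambda s: s.replace('-', '')).astype(int)

print(via_str.tolist())    # [20200131, 20211205]
print(via_apply.tolist())  # [20200131, 20211205]
```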
A mildly interesting(?) thing to note is that neither .dt.strftime nor .str.replace of pandas is particularly optimized, so calling Python's strftime and str.replace via apply() is actually faster than the pandas counterparts in the timings below (roughly 1.5x faster in the case of strftime, and only slightly faster for str.replace).
import pandas as pd
dates = pd.Series(pd.date_range('2020','2200', freq='D'))
%timeit dates.dt.strftime('%Y%m%d')
# 719 ms ± 41.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit dates.apply(datetime.strftime, format='%Y%m%d')
# 472 ms ± 34.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
dates = dates.astype(str)
%timeit dates.str.replace('-', '')
# 30.9 ms ± 2.46 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit dates.apply(lambda x: x.replace('-', ''))
# 26 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
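The %timeit lines above only work in IPython/Jupyter. As a rough sketch, the strftime comparison can be reproduced in a plain script with the standard timeit module. The helper names and the shorter date range are my own choices to keep the run quick, and the numbers will vary by machine and pandas version.

```python
import timeit
from datetime import datetime

import pandas as pd

# Smaller range than in the benchmark above so the script finishes quickly.
dates = pd.Series(pd.date_range('2020', '2030', freq='D'))


def with_dt_accessor():
    # pandas' own vectorized accessor
    return dates.dt.strftime('%Y%m%d')


def with_apply():
    # Python's datetime.strftime called per element via apply
    return dates.apply(datetime.strftime, format='%Y%m%d')


# Sanity check: both approaches should produce the same strings.
same_output = with_dt_accessor().tolist() == with_apply().tolist()
print('same output:', same_output)

for func in (with_dt_accessor, with_apply):
    runs = 5
    seconds = timeit.timeit(func, number=runs) / runs
    print(f'{func.__name__}: {seconds * 1000:.1f} ms per call')
```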