I started to work with pandas recently and during testing with 'date' I have found this challenge. Given this dataframe:
df = pd.DataFrame({'id': [123, 431, 652, 763, 234],
'time': ['8/1/2017', '6/1/2015', '7/1/2016', '9/1/2014', '12/1/2018']})
Create the new dataframe with backdate columns look like this:
id time time1 time2 time3 time4 time5
0 123 2017-08-01 2017-07-01 2017-06-01 2017-05-01 2017-04-01 2017-03-01
1 431 2015-06-01 2015-05-01 2015-04-01 2015-03-01 2015-02-01 2015-01-01
2 652 2016-07-01 2016-06-01 2016-05-01 2016-04-01 2016-03-01 2016-02-01
3 763 2014-09-01 2014-08-01 2014-07-01 2014-06-01 2014-05-01 2014-04-01
4 234 2018-12-01 2018-11-01 2018-10-01 2018-09-01 2018-08-01 2018-07-01
I tries with these codes:
df['time'] = pd.to_datetime(df['time'], errors='coerce') #Object to Date
df['time1'] = df['time'] - pd.DateOffset(months=1)
df['time2'] = df['time'] - pd.DateOffset(months=2)
df['time3'] = df['time'] - pd.DateOffset(months=3)
df['time4'] = df['time'] - pd.DateOffset(months=4)
df['time5'] = df['time'] - pd.DateOffset(months=5)
Are there anyway to solve this problem faster and more efficient? I've already tested several methods to create the backdate. However I don't know how to do it with multiple columns. Because if the data requires to backdate 24 months, I have to copy and paste a lot (manually).