I have a dataset called weather with a single column, 'Date', that looks like this:
Date |
---|
2020-01-01 |
2020-01-02 |
2020-02-01 |
2020-02-04 |
2020-03-01 |
2020-04-01 |
2020-04-02 |
2020-04-03 |
2020-04-04 |
2020-05-01 |
2020-06-01 |
2020-07-01 |
2020-08-01 |
2020-09-01 |
2020-10-01 |
2020-11-01 |
2020-01-01 |
2020-02-01 |
2020-04-01 |
2020-05-01 |
2020-06-01 |
2020-07-01 |
2020-08-01 |
2020-09-01 |
2020-10-01 |
2020-11-01 |
2020-12-01 |
2020-01-01 |
The problem is that the year is always 2020, when it should run through 2020, 2021, and 2022.
The desired column looks like this:
Date |
---|
2020-01-01 |
2020-01-02 |
2020-02-01 |
2020-02-04 |
2020-03-01 |
2020-04-01 |
2020-04-02 |
2020-04-03 |
2020-04-04 |
2020-05-01 |
2020-06-01 |
2020-07-01 |
2020-08-01 |
2020-09-01 |
2020-10-01 |
2020-11-01 |
2021-01-01 |
2021-02-01 |
2021-04-01 |
2021-05-01 |
2021-06-01 |
2021-07-01 |
2021-08-01 |
2021-09-01 |
2021-10-01 |
2021-11-01 |
2021-12-01 |
2022-01-01 |
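The last month of a year is not necessarily 12, but each new year always starts with month 01. Month 01 can also appear more than once within the same year (for example 2020-01-01 and 2020-01-02), so a new year begins only where the month drops back to 01 after a later month.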
Here is my code:
    month = ['01','02','03','04','05','06','07','08','09','10','11','12']
    for i in range(len(weather['Date'])):
        year = 2022
        for j in range(len(month)):
            if weather['Date'][i][5:7] == '01':
                weather['Date'][i] = weather['Date'][i].apply(lambda x: 'year' + x[5:])
Are there any suggestions for fixing my code so that I get the desired column?
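One possible approach, sketched below under a few assumptions: the dates are 'YYYY-MM-DD' strings, the rows are already in chronological order, and a new year starts wherever the month number is lower than in the previous row. The function name `roll_years` and its `start_year` parameter are made up for this illustration and do not come from pandas or from the code above.

```python
def roll_years(dates, start_year=2020):
    """Hypothetical helper: return a new list of 'YYYY-MM-DD' strings in which
    the year is incremented each time the month drops relative to the previous row."""
    fixed = []
    year = start_year
    prev_month = None
    for d in dates:
        month = int(d[5:7])
        # A drop in the month number (e.g. 11 -> 01) marks the start of a new year.
        # Repeated months (01 -> 01) or increasing months stay in the same year.
        if prev_month is not None and month < prev_month:
            year += 1
        # Keep the original '-MM-DD' part and swap in the running year.
        fixed.append(f"{year}{d[4:]}")
        prev_month = month
    return fixed


sample = ["2020-01-01", "2020-01-02", "2020-11-01",
          "2020-01-01", "2020-12-01", "2020-01-01"]
print(roll_years(sample))
```

With a DataFrame, the same helper could be applied as `weather['Date'] = roll_years(weather['Date'].tolist())`. Note that this is a plain string replacement, so a 02-29 row moved into a non-leap year would produce an invalid date and would need separate handling.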