I have data such as the following:
start_date | end_date | project_sales | project_category | project_code |
---|---|---|---|---|
2015-08-03 | 2015-08-06 | 1683 | CatA | 1 |
2015-08-02 | 2015-08-04 | 6500 | CatB | 2 |
I want to create a timeframe from date 2014-01-01 to 2020-12-01 and assign the details of the above table such as : For e.g. here I created date_period df from 2015-08-02 to 2018-08-06
date_period | CatA_project_sales | CatB_project_sales | CatA_No_of_projects | CatB_No_of_projects |
---|---|---|---|---|
2015-08-02 | 2166 | 0 | 1 | 0 |
2015-08-03 | 2166 | 561 | 1 | 1 |
2015-08-04 | 2166 | 561 | 1 | 1 |
2015-08-05 | 0 | 561 | 0 | 1 |
2015-08-06 | 0 | 561 | 0 | 1 |
As you can see above, project_sales are divided by the number of days it was active. I am very clear with logic but I'm struggling to come up with the code to support that.
Any help is appreciated.
Note: This code will need to run on huge dataframe dating from 2014-01-01 to 2020-12-31, requesting the help keeping this in consideration