Probably an extension to : De-aggregate / reverse-summarise / expand a dataset in R I have a dataframe that looks like :
I_Code Date_1 Date_2
2 14/09/2019 16/08/2019
2 14/09/2019 17/08/2019
2 14/09/2019 19/08/2019
2 14/09/2019 20/08/2019
2 14/09/2019 21/08/2019
2 14/09/2019 21/08/2019
2 14/09/2019 21/08/2019
2 14/09/2019 22/08/2019
2 14/09/2019 23/08/2019
2 14/09/2019 23/08/2019
2 14/09/2019 24/08/2019
2 14/09/2019 27/08/2019
2 14/09/2019 28/08/2019
2 14/09/2019 28/08/2019
2 14/09/2019 29/08/2019
2 14/09/2019 04/09/2019
2 14/09/2019 04/09/2019
2 14/09/2019 04/09/2019
2 14/09/2019 05/09/2019
2 14/09/2019 08/09/2019
2 14/09/2019 10/09/2019
2 14/09/2019 10/09/2019
2 14/09/2019 12/09/2019
I_code can take 5 values.
I have another dataframe that looks like :
date_2 count
20/09/2019 415
19/09/2019 431
31/08/2019 386
24/09/2019 404
11/08/2019 252
27/09/2019 441
28/09/2019 398
17/09/2019 430
07/09/2019 388
10/09/2019 369
22/08/2019 318
25/09/2019 420
25/08/2019 380
17/08/2019 291
01/09/2019 381
30/08/2019 345
22/09/2019 455
07/09/2019 388
09/08/2019 213
24/09/2019 404
23/08/2019 344
17/08/2019 291
07/09/2019 388
here the count represents the frequency of the date_2.
I want to expand the dataframe1 to 20k entries based on frequency of date_2 from dataframe2.
for each date_2 in dataframe 1, we have 5 I_code associated. I want to distribute them equally into the dataframe.
for example :
20/09/2019 has a total of 415 count, then we need to have 415 entries of (I_code, Date_1, Date_2). There might be multiple tuple entries for 3 fields in the dataframe1. we need to pick all the values associated with date_2 and then expand all the entries based on the frequency of date_2 in dataframe2.
Can anyone help.