I am new to this, so this may be a naive question. I want to generate a random dataset with some constraints:
date_1 - already generated in csv (Dated from 1 august 2018- 1 august 2019)
date_2 - 60% of the data lies within the 30 days from the date_1 and 40% of the data lies within 90 days of the date_2.
capacity_1 - 3500 kgs is the threshold for a day. Cannot exceed the same for date_2
capacity_2 - leftout weight for the day. its 3500-capacity_1 for a particular day.
The date_1 format that I have is d/m/y
Can anyone advise me on how to generate the other columns? I am planning to build the dummy data with 100,000 rows.
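To make the date_2 constraint concrete, here is a minimal sketch using only the Python standard library. It rests on assumptions of mine: that "within 30 days" means an offset of 0 to 30 days after date_1, that the other 40% should land 31 to 90 days after date_1, and that the split should be exact rather than approximate. The names make_date_2 and seed are hypothetical, chosen only for illustration.

```python
import random
from datetime import datetime, timedelta

DATE_FMT = "%d/%m/%Y"  # d/m/y, matching the date_1 column


def make_date_2(date_1_values, seed=0):
    """Return one date_2 string for each date_1 string.

    Assumption: exactly 60% of the rows get an offset of 0-30 days after
    date_1, and the remaining 40% get an offset of 31-90 days.
    """
    rng = random.Random(seed)
    n = len(date_1_values)
    n_near = round(0.6 * n)

    # Build the 60/40 flags up front, then shuffle so the "near" rows
    # are spread randomly through the data.
    is_near = [True] * n_near + [False] * (n - n_near)
    rng.shuffle(is_near)

    result = []
    for text, near in zip(date_1_values, is_near):
        d1 = datetime.strptime(text, DATE_FMT)
        offset = rng.randint(0, 30) if near else rng.randint(31, 90)
        result.append((d1 + timedelta(days=offset)).strftime(DATE_FMT))
    return result


if __name__ == "__main__":
    sample = ["01/08/2018"] * 10
    for d1, d2 in zip(sample, make_date_2(sample, seed=1)):
        print(d1, d2)
```

If the 40% is instead meant to cover the whole 0 to 90 day range, only the second randint range would need to change.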
Edit : Attaching the csv file for the data here
EDIT2 : The input would look like :
date_1
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
01/08/2018
Expected Output :
Here capacity_2 would be 3500 - capacity_1 for a particular date_2. In other words, capacity_2 indicates how much of the 3500 kg limit is still unused on that date.
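One possible reading of the capacity columns, sketched with the standard library only. The assumptions here are mine and may not match what is actually needed: capacity_1 is treated as a random per-row weight, the running total of capacity_1 for each date_2 is never allowed to pass 3500, and capacity_2 is whatever remains of the limit on that date_2 after the row is counted. The names assign_capacities, max_row_weight and seed are hypothetical.

```python
import random
from collections import defaultdict

DAILY_LIMIT = 3500  # kg per day, the threshold stated in the question


def assign_capacities(date_2_values, max_row_weight=50, seed=0):
    """Return a (capacity_1, capacity_2) pair for each date_2 value.

    Assumptions: capacity_1 is a per-row weight in kg; capacity_2 is the
    part of DAILY_LIMIT still unused on that date_2 after this row.
    """
    rng = random.Random(seed)
    used = defaultdict(int)  # running total of capacity_1 per date_2
    rows = []
    for day in date_2_values:
        remaining = DAILY_LIMIT - used[day]
        # Never draw more than what is left for this day; once a day is
        # full, every later row on that day gets a weight of 0.
        weight = rng.randint(0, min(max_row_weight, remaining))
        used[day] += weight
        rows.append((weight, DAILY_LIMIT - used[day]))
    return rows


if __name__ == "__main__":
    days = ["05/08/2018", "05/08/2018", "06/08/2018"]
    for day, (c1, c2) in zip(days, assign_capacities(days, seed=1)):
        print(day, c1, c2)
```

With 100,000 rows spread over roughly a year of dates, max_row_weight controls how quickly each day fills up, so it would need tuning to get realistic-looking data.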
Thanks