Let's suppose we have this dataframe called 'chosen_data' that contains start & end dates & duration of available slots in which we can perform a task (I shall provide a sample so you can reproduce it if you wish). Let's fix the task duration to x = 24h, if my first available slot's duration is 100hours I want to subdivide it into 100-24=76 portions where each one's duration is equal to 'x'. I want to store these portions into a new dataframe (called result) containing start & end dates of each subdivision.
The actual script does the job however it takes forever to run when I apply it on bigger Datasets, I just need to replace the for-loops :
import pandas as pd
import numpy as np
data = [['3/6/1999 5:00','3/8/1999 0:00',43],['3/8/1999 22:00','3/12/1999 19:00',93],['3/12/1999 22:00','3/21/1999 20:00',214],['3/22/1999 19:00','3/26/1999 3:00',80]]
chosen_data = pd.DataFrame(data, columns = ['Start Date', 'End Date', 'Duration [Hours]'])
chosen_data[['Start Date','End Date']] = chosen_data[['Start Date','End Date']].apply(pd.to_datetime)
result = pd.DataFrame(columns=['Start Date','End Date', 'Duration [Hours]'])
x = 24 # Task duration(normaly it's a user input variable)
for i in chosen_data.index :
fenetre = chosen_data.copy()
num_subdiv = fenetre['Duration [Hours]'].loc[i] - x
for j in range(int(num_subdiv)) :
fenetre['Start Date'].loc[i] = fenetre['Start Date'].loc[i] + pd.to_timedelta(1, unit ='h')
#fenetre['Start Date'].loc[i] = fenetre['Start Date'].loc[i] + pd.to_timedelta(j, unit ='h')
fenetre['End Date'].loc[i] = fenetre['Start Date'].loc[i] + pd.to_timedelta(x , unit ='h')
#fenetre['End Date'].loc[i] = fenetre['End Date'].loc[i] + pd.to_timedelta(j - substract , unit ='h')
result = result.append(fenetre.loc[i] , ignore_index= True)
result['Duration [Hours]'] = (result['End Date'] - result['Start Date']) / pd.to_timedelta(1, unit = 'h')
I'm looking for a better way to achieve this purpose, here's the desired output :