I have a dataframe and am trying to calculate the time difference between one topic and the next while staying within a single call, i.e. without working out the time difference between topics that belong to different calls. Each interaction_id identifies a separate call.
This is an example dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 'Cost'], [1, 5.72, np.nan], [1, 8.83, 'Billing'], [1, 12.86, np.nan], [2, 2, 'Cost'], [2, 6.75, np.nan], [2, 8.54, np.nan], [3, 1.5, 'Payments'], [3, 3.65, 'Products']], columns=['interaction_id', 'start_time', 'topic'])
interaction_id  start_time  topic
1               2           Cost
1               5.72        NaN
1               8.83        Billing
1               12.86       NaN
2               2           Cost
2               6.75        NaN
2               8.54        NaN
3               1.5         Payments
3               3.65        Products
And this is the desired output:
# assumes `import numpy as np` and `import pandas as pd`
df2 = pd.DataFrame([[1, 2, 'Cost', 6.83], [1, 5.72, np.nan, np.nan], [1, 8.83, 'Billing', 4.03], [1, 12.86, np.nan, np.nan], [2, 2, 'Cost', 6.54], [2, 6.75, np.nan, np.nan], [2, 8.54, np.nan, np.nan], [3, 1.5, 'Payments', 2.15], [3, 3.65, 'Products', '...']], columns=['interaction_id', 'start_time', 'topic', 'topic_length'])
interaction_id  start_time  topic     topic_length
1               2           Cost      6.83
1               5.72        NaN       NaN
1               8.83        Billing   4.03
1               12.86       NaN       NaN
2               2           Cost      6.54
2               6.75        NaN       NaN
2               8.54        NaN       NaN
3               1.5         Payments  2.15
3               3.65        Products  ....
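To spell out the rule behind those numbers: a topic runs from its own start_time until the next topic starts in the same call, or until the last row of that call if no further topic follows (e.g. Cost in call 1 is 8.83 - 2 = 6.83, Billing is 12.86 - 8.83 = 4.03). Below is a rough sketch of that rule in pandas, in case it helps show what I'm after. It is only one possible approach, not necessarily the best one. The names topics, next_start, call_end and end are just illustrative, and the value it gives for the final Products row (0.0, because that topic starts on the call's last row) is an assumption on my part, since I have not decided what that cell should contain.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 'Cost'], [1, 5.72, np.nan], [1, 8.83, 'Billing'], [1, 12.86, np.nan],
     [2, 2, 'Cost'], [2, 6.75, np.nan], [2, 8.54, np.nan],
     [3, 1.5, 'Payments'], [3, 3.65, 'Products']],
    columns=['interaction_id', 'start_time', 'topic'])

# Keep only the rows where a topic starts (original index is preserved).
topics = df.dropna(subset=['topic'])

# Start time of the next topic in the same call; NaN for a call's last topic.
next_start = topics.groupby('interaction_id')['start_time'].shift(-1)

# start_time of the last row of each call, broadcast to every row of that call.
call_end = df.groupby('interaction_id')['start_time'].transform('last')

# A topic ends at the next topic's start, otherwise at the call's last row.
# fillna with a Series aligns on the index, so only the topic rows are filled.
end = next_start.fillna(call_end)

# Assignment aligns on the index, so rows without a topic stay NaN.
# Assumption: a topic on the call's last row (Products) ends up as 0.0.
df['topic_length'] = (end - topics['start_time']).round(2)

print(df)
```

The groupby on interaction_id is what keeps the shift from reaching into the next call: the last topic of a call gets NaN from shift(-1) instead of the first topic time of the following call.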
I hope that makes sense