I have a list of contracts with start and end dates.
How can I compute the number of overlapping contracts during the life span of a contracts?
df = pd.DataFrame({
'contract': pd.Series(['A1', 'A2', 'A3', 'A4']),
'start': pd.Series(['01/01/2015', '03/02/2015', '15/01/2015', '10/01/2015']),
'end': pd.Series(['16/01/2015', '10/02/2015', '18/01/2015', '12/01/2015'])
})
which gives:
contract end start
0 A1 16/01/2015 01/01/2015
1 A2 10/02/2015 03/02/2015
2 A3 18/01/2015 15/01/2015
3 A4 12/01/2015 10/01/2015
A1 overlaps with A3 and A4, therefore overlaps = 2. A2 overlaps with no contract, therefore overlaps = 0. A3 overlaps with A1, therefore overlaps = 1. A4 overlaps with A1, therefore overlaps = 1.
I could just compare each time span (start to end) but that is O(n**2)
Any better idea?
I have the feeling an improvement could be gained by sorting and then looping through the sorted ranges