I have a table like the following:
import datetime
import pandas

df = pandas.DataFrame(
    [[1, datetime.datetime(2018, 1, 1), datetime.datetime(2019, 1, 1), 'Joe'],
     [2, datetime.datetime(1999, 1, 1), datetime.datetime(2019, 1, 1), 'James'],
     [3, datetime.datetime(1980, 1, 1), datetime.datetime(2000, 1, 1), 'Jack'],
     [4, datetime.datetime(1967, 1, 1), datetime.datetime(1975, 1, 1), 'Jim']],
    columns=['PERSON ID', 'START DATE', 'END DATE', 'NAME'],
)
I want to get a list of colleagues, and a count of them, for each person based on their start and end dates (two people are colleagues when their date ranges overlap). The expected output is:
Name   Number_of_colleagues  List_of_colleagues
Joe    1                     [James]
James  2                     [Joe, Jack]
Jack   1                     [James]
Jim    0                     []
Any recommendations on how to do this? I have tried using a nested for loop to iterate over each row. It works, but it is really slow for 20000 rows.
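
One possible direction, offered as a minimal sketch and not a definitive answer: replace the Python-level loops with NumPy broadcasting. It assumes two people count as colleagues when their date ranges overlap (which matches the expected output above) and that ranges touching at an endpoint also count as overlapping. The helper name `find_colleagues` is hypothetical.

```python
import datetime

import numpy as np
import pandas


def find_colleagues(df):
    """Hypothetical helper: count and list people whose date ranges overlap."""
    start = df['START DATE'].to_numpy()
    end = df['END DATE'].to_numpy()
    names = df['NAME'].to_numpy()

    # overlap[i, j] is True when person i's range intersects person j's.
    # Assumption: touching endpoints (one end == another start) count as overlap.
    overlap = (start[:, None] <= end[None, :]) & (start[None, :] <= end[:, None])
    np.fill_diagonal(overlap, False)  # nobody is their own colleague

    return pandas.DataFrame({
        'Name': names,
        'Number_of_colleagues': overlap.sum(axis=1),
        'List_of_colleagues': [names[row].tolist() for row in overlap],
    })


df = pandas.DataFrame(
    [[1, datetime.datetime(2018, 1, 1), datetime.datetime(2019, 1, 1), 'Joe'],
     [2, datetime.datetime(1999, 1, 1), datetime.datetime(2019, 1, 1), 'James'],
     [3, datetime.datetime(1980, 1, 1), datetime.datetime(2000, 1, 1), 'Jack'],
     [4, datetime.datetime(1967, 1, 1), datetime.datetime(1975, 1, 1), 'Jim']],
    columns=['PERSON ID', 'START DATE', 'END DATE', 'NAME'],
)
result = find_colleagues(df)
print(result)
```

The trade-off is memory: the comparison builds n x n boolean matrices, roughly 400 MB each at 20000 rows, so on a smaller machine the rows may need to be processed in chunks.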