I am trying to find a nice clean Python/Pandas way to take the following dataframe:
ID Date_Field Category
12345 01/01/12 AAAA
12345 01/02/12 AAAA
12345 01/03/12 AAAA
12345 01/04/12 BBBB
12345 01/05/12 BBBB
12345 01/06/12 BBBB
2468 01/01/12 AAAA
2468 01/02/12 AAAA
2468 01/03/12 AAAA
2468 01/04/12 BBBB
2468 01/05/12 BBBB
2468 01/06/12 BBBB
2468 01/07/12 BBBB
2468 01/08/12 CCCC
2468 01/09/12 CCCC
2468 01/10/12 AAAA
2468 01/11/12 AAAA
And convert it to a dataframe that looks similar to this, i.e., where I have the begin/end date per ID/Category, but with the caveat that ID/Category combo can occur multiple times if there was a date gap between:
ID Start_Date End_Date Category
12345 01/01/12 01/03/12 AAAA
12345 01/04/12 01/06/12 BBBB
2468 01/01/12 01/03/12 AAAA
2468 01/10/12 01/11/12 AAAA
2468 01/04/12 01/07/12 BBBB
2468 01/08/12 01/09/12 CCCC
I know I can do it via iteration and comparison to previous row values, but I have a feeling there's a far cleaner approach.