I have a dataframe which I am creating by reading an Excel file:
Project Release Name Cycle Name Cycle Start Date Cycle End Date Exec Date Planned Exec Date Available Test Cases Planned Tested Passed Failed Blocked No Run Tester
B1 Y1 CM1 2/7/2018 2/20/2018 2/6/2018 2/6/2018 2 10 8 8 0 0 0 Tester3
B1 Y1 CM1 2/7/2018 2/20/2018 2/7/2018 2/7/2018 2 13 10 9 1 1 0 Tester3
B1 Y1 CM1 2/7/2018 2/20/2018 2/8/2018 2/8/2018 0 1 1 1 0 0 0 Tester3
B1 Y1 CM1 2/7/2018 2/20/2018 2/9/2018 2/9/2018 0 2 2 2 0 0 0 Tester3
B1 Y1 CM1 2/7/2018 2/20/2018 2/10/2018 2/10/2018 0 2 2 2 0 0 0 Tester3
B1 Y1 CL1 2/7/2018 2/25/2018 2/1/2018 2/1/2018 5 25 20 20 0 0 0 Tester 4
B1 Y1 CL1 2/7/2018 2/25/2018 2/2/2018 2/2/2018 10 30 20 18 2 0 0 Tester 4
B1 Y1 CL1 2/7/2018 2/25/2018 2/3/2018 2/3/2018 0 2 2 0 2 0 0 Tester 4
B1 Y1 CL1 1/17/2018 2/25/2018 2/4/2018 2/4/2018 0 3 3 1 2 0 0 Tester 4
B1 Y1 CL1 1/17/2018 2/25/2018 2/5/2018 2/5/2018 5 32 25 20 4 1 0 Tester 4
C1 Z1 CK1 1/10/2018 2/20/2018 2/3/2018 2/3/2018 0 1 1 0 1 0 0 Tester5
C1 Z1 CK1 1/10/2018 2/20/2018 2/4/2018 2/4/2018 0 1 1 0 1 0 0 Tester5
C1 Z1 CK1 1/10/2018 2/20/2018 2/5/2018 2/5/2018 0 1 1 0 1 0 0 Tester5
C1 Z1 CK1 1/10/2018 2/20/2018 2/6/2018 2/6/2018 0 1 1 1 0 0 0 Tester5
C1 Z1 CK1 1/10/2018 2/20/2018 2/7/2018 2/7/2018 0 1 1 1 0 0 0 Tester6
C1 Z1 CK1 1/10/2018 2/20/2018 2/8/2018 2/8/2018 0 1 1 1 0 0 0 Tester6
C1 Z1 CK2 1/17/2018 2/18/2018 2/6/2018 2/6/2018 0 1 1 1 0 0 0 Tester6
C1 Z1 CK2 1/17/2018 2/18/2018 2/7/2018 2/7/2018 0 2 2 0 2 0 0 Tester6
C1 Z1 CK2 1/17/2018 2/18/2018 2/8/2018 2/8/2018 0 2 2 0 2 0 0 Tester7
C1 Z1 CK2 1/17/2018 2/18/2018 2/9/2018 2/9/2018 0 2 2 0 2 0 0 Tester7
C1 Z1 CK2 1/17/2018 2/18/2018 2/10/2018 2/10/2018 0 2 2 1 1 0 0 Tester7
C1 Z1 CK2 1/17/2018 2/18/2018 2/11/2018 2/11/2018 0 2 2 2 0 0 0 Tester7
And I am using pandas groupby as follows:
dx1 = pd.read_excel('Trend.xlsx',sheetname='Execution by Date')
dx1 = dx1.groupby(['Project', 'Release Name', 'Cycle Name', 'Cycle Start Date',
'Cycle End Date'])['Exec Date','Planned Exec Date', 'Available Test Cases', 'Planned', 'Tested', 'Passed', 'Failed',
'Blocked'].sum().reset_index()
and here is the result I get:
Project Release Name Cycle Name Cycle Start Date Cycle End Date Available Test Cases Planned Tested Passed Failed Blocked
B1 Y1 CL1 2018-01-17 00:00:00 2018-02-25 00:00:00 5 35 28 21 6 1
B1 Y1 CL1 2018-02-07 00:00:00 2018-02-25 00:00:00 15 57 42 38 4 0
B1 Y1 CM1 2018-02-07 00:00:00 2018-02-20 00:00:00 4 28 23 22 1 1
C1 Z1 CK1 2018-01-10 00:00:00 2018-02-20 00:00:00 0 6 6 3 3 0
C1 Z1 CK2 2018-01-17 00:00:00 2018-02-18 00:00:00 0 11 11 4 7 0
As you can see that, 'Exec Date' and 'Planned Exec Date' are missing.
How can I bring both of the missing date columns back into the dataframe. I have tried all seemingly relevant solutions but none worked for me.