Attendance:
I am working on the attendance data as shown in the image, using pandas. punch_time is the time the employee checked in or checked-out. punch_state takes on two values i.e. 0 for check-in and 1 for check-out
What I am trying to do is find the check-in and check-out time and store them in the same row for each employee something like this:
The employees work on two shifts morning and evening which can be determined from the punch_time and punch_state.
I tried creating a pivot table using pandas. It works fine for 1st shift because they have check-in and check-out on the same day but it is causing issue in the 2nd shift data when check-in is on one day and check-out is on the next day. This issue is significant especially when there's a Sunday (Sundays are off) in between so the data for Saturday check-in and check-out and Monday's check-in and check-out. You can see the output:
In output, you can see that check-out(1) time of Saturday i.e. 2023-03-11 is not properly assigned instead it's in the 2023-03-12(Sunday) column check-out(1)
Here's the pivot table code I wrote:
df['date'] = pd.to_datetime(df['date'])
# Create a new column for day of week
df['day_of_week'] = df['date'].dt.dayofweek
# Filter out Sundays where employee has not checked in
sundays = df[(df['day_of_week'] == 6) & (df['punch_state'] == 1)]['date']
df = df[(df['day_of_week'] != 6) | (df['date'].isin(sundays))]
# Create pivot table
pivot_table = pd.pivot_table(df, values=['time'], index=['employee_name', 'date'], columns='punch_state', aggfunc='first')
pivot_table = pivot_table.reset_index()
Is there any way I can fix this? Any help would be greatly appreciated.