I have data with 6 Variables : EmployeeID, JobID, Name, JobLocation, Date, and HoursWorked. I want to group my data by EmployeeID and JobID (i.e. find all records with the same EmployeeID and JobID in one row), and then find the minimum and maximum date by group, and the sum of all HoursWorked between those dates. I want my data to end up with columns: EmployeeID, JobID, Name JobLocation, MinDate, MaxDate, TotalHoursWorked.
I have tried this so far but the MinDate,MaxDate, and TotalHoursWorked are showing the same date for every record.
Data$EmployeeID<- as.factor(Data$EmployeeID)
Data$JobID<- as.factor(Data$JobID)
Data$Date<- as.factor(Data$Date)
Data$Date<- as.Date(Data$Date,format="%m/%d/%Y")
Data$HoursWorked<-as.numeric(Data$HoursWorked)
Data<-Data[c("EmployeeID", "Name","JobID", "JobLocation", "Date", "HoursWorked")]
Data<- Data%>%
group_by(Data$EmployeeID,Data$JobID, Data$Name,Data$JobLocation) %>%
summarize(TotalHoursWorked = sum(Data$HoursWorked)) %>%
mutate(MaxDate=max(Data$Date), MinDate=min(Data$Date))
Output of sample(Data) without the "Name" column:
> sample(Data)
# A tibble: 1,000 x 5
EmployeeID HoursWorked JobID Date JobLocation
<fct> <dbl> <fct> <date> <chr>
1 32589 4 B3031-002513-00 2016-03-14 #
2 32590 8 B3031-002562-00 2016-04-08 #
3 32591 9 B3031-002564-00 2016-04-05 #
4 32591 2.5 B3031-002564-00 2016-04-06 #
5 32591 3 B3031-002562-00 2016-04-07 #
6 32591 7.5 B3031-002562-00 2016-04-08 #
7 32605 0 B3031-002348-00 2016-01-04 #
8 32605 3 B3031-002419-00 2016-01-04 #
9 32605 0 B3031-002348-00 2016-01-05 #
10 32605 3 B3031-002419-00 2016-01-05 #
# ... with 990 more rows
And output after I run the group_by code:
> sample(Data)
# A tibble: 80 x 6
MaxDate `Data$JobID` MinDate `Data$\`Job Location\`` TotalHoursWorked `Data$EmployeeID`
<date> <fct> <date> <chr> <dbl> <fct>
1 2016-07-29 B3031-002513-00 2016-01-04 # 3288. 32589
2 2016-07-29 B3031-002562-00 2016-01-04 # 3288. 32590
3 2016-07-29 B3031-002562-00 2016-01-04 # 3288. 32591
4 2016-07-29 B3031-002564-00 2016-01-04 # 3288. 32591
5 2016-07-29 B3031-002348-00 2016-01-04 # 3288. 32605
6 2016-07-29 B3031-002419-00 2016-01-04 # 3288. 32605
7 2016-07-29 B3031-002445-00 2016-01-04 # 3288. 32605
8 2016-07-29 B3031-002502-00 2016-01-04 # 3288. 32605
9 2016-07-29 B3031-002504-00 2016-01-04 # 3288. 32605
10 2016-07-29 B3031-002505-00 2016-01-04 # 3288. 32605
# ... with 70 more rows