I need your help:
The general idea is to create one new column: group the data by a column (DEP), then count the total number of rows per group using the column (id).
Then filter the data on another column (dely, which is T_depart in my code below), keeping only dely >= 60, and count the id values per group again.
Then calculate the share as (number of rows in the filtered data) / (total number calculated at the beginning).
total = count(id by group)
share = count(id where dely >= 60, by group) / total
I was able to do it in 3 steps, but I wanted to know whether it is possible to do it in a faster way.
# group the data by DEP and count the rows per group
Total_group <- df %>%
  group_by(DEP) %>%
  summarise(n = n())
# filter the data: keep only T_depart >= 60
Filter_60 <- df %>% filter(T_depart >= 60)
# then group the filtered data by DEP, as I did for the total
Filter_60_group <- Filter_60 %>%
  group_by(DEP) %>%
  summarise(n = n())
# then calculate the share (share_dep)
# after the join, n.x is the total count per DEP and n.y is the filtered count
share_data <- left_join(Total_group, Filter_60_group, by = "DEP") %>%
  mutate(share_dep = n.y / n.x)
Any idea how to put all these steps into one or two steps?
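For example, I wonder whether a single summarise() along these lines would do it. This is only a minimal sketch on made-up toy data, assuming df has the columns DEP and T_depart; the names total and n_60 are just placeholders I chose for illustration. It relies on the fact that summing a logical vector counts its TRUE values.

```r
library(dplyr)

# Hypothetical toy data standing in for the real df
# (assumed columns: DEP and T_depart)
df <- data.frame(
  DEP = c("A", "A", "A", "B", "B"),
  T_depart = c(30, 60, 90, 10, 75)
)

share_data <- df %>%
  group_by(DEP) %>%
  summarise(
    total = n(),                  # all rows in the group
    n_60 = sum(T_depart >= 60),   # rows with T_depart >= 60 (TRUE counts as 1)
    share_dep = n_60 / total      # filtered count divided by total count
  )

print(share_data)
```

If only the share is needed, mean(T_depart >= 60) inside summarise() should give the same value directly. If T_depart can contain NA, sum() and mean() would probably need na.rm = TRUE, and the denominator would then need some thought as well.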