
I need your help:

The general idea is to create one new column: group the data by a column (DEP), then count the total number of rows per group using the column (id).

Then filter the data on another column (dely, called T_depart in the code below), keeping only dely >= 60, and count the id values again.

Then calculate the share as (number of rows in the filtered data) / (total number of rows calculated at the beginning).

total = count(id by group)

share = count(dely >= 60) / total

I was able to do it in 3 steps, but I wanted to know whether it is possible to do it in a faster way.

# group the data by DEP

Total_group<-df %>%
  group_by(DEP) %>%
  summarise(n = n())

# filter the data to T_depart >= 60

Filter_60 <- df %>% filter(T_depart >= 60)

# then group the filtered data by DEP, as I did for the total

Filter_60_group <- Filter_60 %>%
  group_by(DEP) %>%
  summarise(n = n())

# then calculate the share (share_dep); n.x is the total count, n.y is the filtered count

share_data <- left_join(Total_group, Filter_60_group, by = "DEP") %>%
  mutate(share_dep = n.y / n.x)

Any idea how to combine all these steps into one or two steps?
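
For reference, the three steps can probably be collapsed into a single `summarise()` call. The sketch below assumes dplyr and uses made-up sample data, since the question does not include any; only the column names `DEP`, `id` and `T_depart` come from the code above. It relies on the fact that the logical vector `T_depart >= 60` sums to the number of matching rows and averages to their share.

```r
library(dplyr)

# Hypothetical sample data (not from the question); column names follow the post.
df <- data.frame(
  id       = 1:6,
  DEP      = c("A", "A", "A", "B", "B", "C"),
  T_depart = c(30, 60, 90, 10, 75, 20)
)

share_data <- df %>%
  group_by(DEP) %>%
  summarise(
    total     = n(),                  # all rows in the group
    n_60      = sum(T_depart >= 60),  # rows with T_depart >= 60
    share_dep = mean(T_depart >= 60)  # equals n_60 / total
  )
```

Unlike the `left_join()` version, a group with no rows at or above 60 gets a share of 0 rather than `NA`. If `T_depart` can contain missing values, `sum()` and `mean()` would need `na.rm = TRUE`, and the share would then be taken over non-missing rows only.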

Sony
  • Welcome to SO, Sony! Questions on SO (especially in R) do much better if they are reproducible and self-contained. By that I mean including sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Sep 23 '22 at 13:40
  • Without the data, as @r2evans already mentioned, I can only give you a few ideas. If the column names are correct, then drop the data frame name from the call to count (use `DLY_TM_DEP`, not `leg_data$`). Additionally, the "/n" shouldn't be there. If you want to keep the data grouped, add the creation of `share` to `summarise` and drop the call to `mutate`. If you want to group temporarily, but want to keep all of the data, add the creation of `n` to `mutate` and drop the call to `summarise`. `summarise` and `mutate` do the same thing, essentially. However, one keeps all data & the other groups – Kat Sep 23 '22 at 14:51
  • Hi @Kat and @r2evans, thank you for your answers. I updated my question with more details; can this help to guide me? Sonia – Sony Sep 24 '22 at 17:36
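
The `mutate` variant mentioned in the comments might look like the sketch below. It again assumes dplyr and hypothetical sample data; only the column names `DEP`, `id` and `T_depart` are taken from the question. Every original row is kept, and the per-group total and share are repeated on each row of the group.

```r
library(dplyr)

# Hypothetical sample data (not from the question); column names follow the post.
df <- data.frame(
  id       = 1:6,
  DEP      = c("A", "A", "A", "B", "B", "C"),
  T_depart = c(30, 60, 90, 10, 75, 20)
)

df_with_share <- df %>%
  group_by(DEP) %>%
  mutate(
    total     = n(),                         # group size, repeated on every row
    share_dep = sum(T_depart >= 60) / total  # share of rows with T_depart >= 60
  ) %>%
  ungroup()                                  # drop the temporary grouping
```

This form suits the case where the share is needed next to the row-level data; `summarise()` is the better fit when one row per `DEP` is enough.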
