I have a data frame like the one below:

           doctor   user           Hour               weekday
1          d1         u1          07_08             Wednesday
2          d1         u2          07_08             Wednesday
3          d1         u2          07_08             Wednesday
4          d1         u2          07_08             Wednesday
5          d1         u3          07_08             Wednesday
6          d1         u3          07_08             Wednesday

I want to get the number of unique users grouped by doctor, hour and weekday. I used a for loop for that, but is there a way to do it with dplyr's group_by function?

Here is what I tried:

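for (doc in unique(d$doctor)) {
  # rows for this doctor
  d_doc <- d[d$doctor == doc, ]
  for (day in unique(d_doc$weekday)) {
    # rows for this doctor on this weekday
    d_day <- d_doc[d_doc$weekday == day, ]
    for (hr in unique(d_day$Hour)) {
      # rows for this doctor, weekday and hour
      h1 <- d_day[d_day$Hour == hr, ]
      # number of distinct users in that slice
      no_of_users <- length(unique(h1$user))
    }
  }
}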

Is there any way to do that without using loops? Thank you.

Navya
  • Are `get unique doctors data` etc. supposed to be comments? Can you add the desired output? – morgan121 Mar 03 '20 at 05:46
  • `d %>% count(doctor,Hour,weekday,user)` – Rohit Mar 03 '20 at 05:49
  • `get unique doctors data` is a comment. I mean that I tried something like the above. – Navya Mar 03 '20 at 05:59
  • What's wrong with the solution by Rohit? This should be the equivalent of doing `d %>% group_by(doctor,user,Hour,weekday) %>% count()`, which is what I would think the solution would be here – Ricky Mar 03 '20 at 06:01
  • @Rohit, it gives separate counts for different users. I want to get how many unique users are present in that hour for each doctor and weekday. – Navya Mar 03 '20 at 06:05
  • Have you tried doing this? `d %>% group_by(doctor,Hour,weekday) %>% count(user)` – Ricky Mar 03 '20 at 06:08
  • One more note on your for loop logic of only taking the `1:length(unique(d$doctor))` , this seems weird to me, why would you only take the length of the number of unique values? I would think you would want to iterate through every row of the dataset to get to your answer and not wrap it with the unique function, but dplyr is definitely better for this so I would stick to using `group_by()` and `count()` like I just outlined – Ricky Mar 03 '20 at 06:10
  • I used length because there are many doctors in the data and I have to loop over each doctor. – Navya Mar 03 '20 at 06:13
  • Try `d %>% group_by(doctor,Hour,weekday) %>% summarise(n=length(unique(user)) )` – Rohit Mar 03 '20 at 06:14

1 Answer

d %>%
  group_by(doctor, Hour, weekday) %>%
  summarize(num_users = n_distinct(user))
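
To check this against the sample in the question, here is a minimal sketch. It rebuilds the six rows by hand (so the `data.frame()` call is a reconstruction, not the asker's real data), uses the column name `Hour` as it is printed in the question, and assumes dplyr 1.0.0 or later for the `.groups` argument. The names `per_user` and `result` are only illustrative. It also runs the `count()` variant from the comments to show why that one returns a row per user rather than a single number per group.

```r
library(dplyr)

# Sample data reconstructed from the question: one doctor, three distinct users
d <- data.frame(
  doctor  = rep("d1", 6),
  user    = c("u1", "u2", "u2", "u2", "u3", "u3"),
  Hour    = rep("07_08", 6),
  weekday = rep("Wednesday", 6),
  stringsAsFactors = FALSE
)

# count() with user included: one row per (doctor, Hour, weekday, user),
# where n is the number of rows for that user, not the number of users
per_user <- d %>% count(doctor, Hour, weekday, user)

# n_distinct(): one row per (doctor, Hour, weekday) with the number of unique users
result <- d %>%
  group_by(doctor, Hour, weekday) %>%
  summarize(num_users = n_distinct(user), .groups = "drop")

print(per_user)
print(result)
```

On this sample, `per_user` has three rows (one each for u1, u2 and u3) while `result` has a single row with `num_users` equal to 3, which is the figure the question asks for.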
user2332849