How to Use Rank Function in R (using dplyr)

Question

I have a data table called prob72, and I want to add a rank column that ranks each row by frac_miss_arr_delay. The highest value of frac_miss_arr_delay should get rank 1 and the lowest value should get the largest rank (rank 53 for my data). The frac_miss_arr_delay values are decimals, all less than 1. When I use the following line of code, every single row is ranked "1":

    prob72 <- prob72 %>% mutate(rank = rank(desc(frac_miss_arr_delay), ties.method = "first"))

I've tried using row_number as well

    prob72 <- prob72 %>% mutate(rank = row_number())

This STILL outputs all "1s" in the rank column.

       week arrDelayIsMissi~     n n_total frac_miss_arr_d~
      <dbl> <lgl>            <int>   <int>            <dbl>
    1    6. TRUE              1012    6101           0.166
    2   26. TRUE               536    6673           0.0803
    3   10. TRUE               518    6549           0.0791
    4   50. TRUE               435    6371           0.0683
    5   49. TRUE               404    6398           0.0631
    6   21. TRUE               349    6285           0.0555


    prob72[6]
    # A tibble: 53 x 1
        rank
       <int>
     1     1
     2     1
     3     1
     4     1
     5     1
     6     1
     7     1
     8     1
     9     1
    10     1
    # ... with 43 more rows

    # Add a week-of-year column to the flights data
    flights_week <- mutate(flights, week = lubridate::week(time_hour))

    # Treat early arrivals as zero delay, then compute the mean delay per week
    prob51 <- flights_week %>%
      mutate(pos_arr_delay = if_else(arr_delay < 0, 0, arr_delay))
    prob52 <- prob51 %>%
      group_by(week) %>%
      mutate(avgDelay = mean(pos_arr_delay, na.rm = TRUE))

    # Flag flights delayed by more than 10x the weekly average
    prob52 <- prob52 %>% mutate(ridic_late = TRUE)
    prob52$ridic_late <- ifelse(prob52$pos_arr_delay > prob52$avgDelay * 10, TRUE, FALSE)
    prob53 <- prob52 %>%
      group_by(week) %>%
      count(ridic_late) %>%
      arrange(desc(ridic_late))
    prob53 <- prob53 %>% filter(ridic_late == TRUE)
    prob54 <- prob52 %>% group_by(week) %>% count(n())

    # Combine the per-week totals with the per-week "ridiculously late" counts
    colnames(prob53)[3] <- "n_ridiculously_late"
    prob53["n"] <- NA
    prob53$n <- prob54$n

    table5 <- subset(prob53, select = c(week, n, n_ridiculously_late))

    prob71 <- flights_week

    # Count flights with a missing arr_delay per week and divide by the weekly total
    prob72 <- prob71 %>%
      group_by(week) %>%
      count(arrDelayIsMissing = is.na(arr_delay)) %>%
      arrange(desc(arrDelayIsMissing)) %>%
      filter(arrDelayIsMissing == TRUE)
    prob72["n_total"] <- NA
    prob72$n_total <- table5$n
    prob72 <- prob72 %>% mutate(percentageMissing = n / n_total)
    prob72 <- prob72 %>% arrange(desc(percentageMissing))

    colnames(prob72)[5] <- "frac_miss_arr_delay"

Do you have any groups set on this object? Check `groups(prob72)`. When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. — MrFlick, Oct 01 '18 at 21:40
`mutate(mtcars, r = rank(disp, ties.method="first"))` worked for me. If MrFlick's comment doesn't help you, then it would help us if you provide a sample of your data, perhaps with `dput(prob72)` (or `dput(head(prob72,n=20))` if large). — r2evans, Oct 01 '18 at 21:41
week arrDelayIsMissi~ n n_total frac_miss_arr_d~ 1 6. TRUE 1012 6101 0.166 2 26. TRUE 536 6673 0.0803 3 10. TRUE 518 6549 0.0791 4 50. TRUE 435 6371 0.0683 5 49. TRUE 404 6398 0.0631 6 21. TRUE 349 6285 0.0555 — NICE8x, Oct 01 '18 at 21:46
@MrFlick I added the output...I can add the code I used to get to the table prob72 but it's a lot — NICE8x, Oct 01 '18 at 21:51
If you are currently grouping by week and there is only one value per week, then that's not going to help you with rank. So you want to rank across all rows in the table? Then use `ungroup()` to remove the grouping first. (What you've posted really isn't helpful because it's not reproducible and it's missing the useful header. See the link originally provided for making reproducible examples.) — MrFlick, Oct 01 '18 at 21:53
@MrFlick I just added all of my code. I use table5 to create prob72 so I included that code too. — NICE8x, Oct 01 '18 at 21:56
@MrFlick I ungrouped and now it works omg thank you so much so much so much you are the best omg thanks — NICE8x, Oct 01 '18 at 21:57
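
For anyone landing here later, the fix from the comments can be reproduced with a small sketch. The `toy` tibble below is a hypothetical stand-in for prob72 (four of the weeks shown in the question, not the real 53-row table), and it assumes the table is still grouped by week, which is what `group_by(week) %>% count(...)` leaves behind. With one row per group, any ranking is computed inside a group of size 1, so every row gets rank 1 until the grouping is dropped.

```r
library(dplyr)

# Hypothetical stand-in for prob72: one row per week, still grouped by week
toy <- tibble(
  week = c(6, 26, 10, 50),
  frac_miss_arr_delay = c(0.166, 0.0803, 0.0791, 0.0683)
) %>%
  group_by(week)

# Shows the grouping variable(s); an ungrouped table returns an empty list
groups(toy)

# Still grouped: each rank is computed within a single-row group
grouped <- toy %>%
  mutate(rank = rank(desc(frac_miss_arr_delay), ties.method = "first"))

# Grouping removed: ranks are computed across the whole table
ungrouped <- toy %>%
  ungroup() %>%
  mutate(rank = rank(desc(frac_miss_arr_delay), ties.method = "first"))

grouped$rank    # 1 1 1 1
ungrouped$rank  # 1 2 3 4
```

The same applies to `row_number()`: inside `mutate()` on a grouped table it numbers rows within each group, so it also needs `ungroup()` first.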
