I am trying to order timestamps with dplyr and lubridate, but I'm not getting the ordering I expect.

library(lubridate)
library(dplyr)

foo <- data.frame(
  time = ymd_hms(c("2016-08-31 13:40:00", "2016-08-31 06:40:00", "2016-08-31 10:40:00")),
  expected_order = c(3, 1, 2)
)
foo %>% mutate(dplyr_ordered = order(time))
cylondude

2 Answers

You are confusing what order and rank do. From ?order:

order returns a permutation which rearranges its first argument into ascending or descending order.

order does not return the rank of each value; it returns an index vector that can be used to sort the vector. Compare the two results:

foo %>% mutate(dplyr_order = order(time), dplyr_rank = rank(time))

#                  time expected_order dplyr_order dplyr_rank
# 1 2016-08-31 13:40:00              3           2          3
# 2 2016-08-31 06:40:00              1           3          1
# 3 2016-08-31 10:40:00              2           1          2

The result from rank is what you are expecting. The result from order tells you that the second element of time is the smallest, followed by the third, and that the first element is the largest.
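To make the difference concrete, here is a minimal sketch in base R. It assumes as.POSIXct with tz = "UTC" as a stand-in for ymd_hms so that no packages are needed; the variable names ord, rnk and sorted are only illustrative. It shows that indexing with the result of order sorts the vector, and that, when there are no ties, applying order twice reproduces the ranks.

```r
# Same timestamps as in the question, built with base R instead of lubridate
time <- as.POSIXct(
  c("2016-08-31 13:40:00", "2016-08-31 06:40:00", "2016-08-31 10:40:00"),
  tz = "UTC"
)

ord <- order(time)  # positions of the elements, smallest first
rnk <- rank(time)   # rank of each element, in its original position

# Indexing with the order vector sorts the timestamps
sorted <- time[ord]

# With no ties, order(order(x)) gives the same values as rank(x)
ranks_from_order <- order(ord)
```

So order answers "which element comes first, second, third?", while rank answers "where does each element land?".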

Psidom

Nothing weird is going on: order returns the positions of the elements in sorted order, whereas you're expecting dplyr_ordered to hold the rank of each row.

foo$time
#> [1] "2016-08-31 13:40:00 UTC" "2016-08-31 06:40:00 UTC" "2016-08-31 10:40:00 UTC"

order(foo$time)
#> [1] 2 3 1

As expected, item 2 of foo$time is the first in order, then 3, then 1.
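If the goal is to actually sort the rows, the index vector from order can be used to subset the data frame. This is a minimal sketch in base R, assuming as.POSIXct in place of ymd_hms so it runs without extra packages; sorted_foo is just an illustrative name.

```r
# Rebuild the data frame from the question using base R only
foo <- data.frame(
  time = as.POSIXct(
    c("2016-08-31 13:40:00", "2016-08-31 06:40:00", "2016-08-31 10:40:00"),
    tz = "UTC"
  ),
  expected_order = c(3, 1, 2)
)

# order() gives the row positions in ascending time, so use it as a row index
sorted_foo <- foo[order(foo$time), ]
```

With dplyr loaded, arrange(foo, time) should give the same row order without calling order yourself.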

Jonathan Carroll