In my dataset, I have counted the number of trips that started from bicycles stations per hour (0-23), for about 200 stations.
> head(test)
Start.station.number hour number
1 31000 0 16
2 31000 1 1
3 31000 2 7
4 31000 3 1
5 31000 4 2
6 31000 5 12
My goal is to get the top ten stations per hour, however, it is likely that many some high popularity stations will have an equal count of trips per hour. For example, if station B, station A, and station C all have the same number of trips and are tied for 10th place at 7 AM, then one of them should be chosen randomly, it doesn't really matter which one is chosen. In R, how would one take the top ten stations by hour, regardless of the some stations having equal counts (this would be a similar function to head(n=10)
per hour) ?
Below is a sample of the data
> dput(test)
structure(list(Start.station.number = c(31000L, 31000L, 31000L,
31000L, 31000L, 31000L, 31000L, 31000L, 31000L, 31000L, 31000L,
31000L, 31000L, 31000L, 31000L, 31000L, 31000L, 31000L, 31000L,
31000L, 31000L, 31000L, 31000L, 31000L, 31001L, 31001L, 31001L,
31001L, 31001L, 31001L, 31001L, 31001L, 31001L, 31001L, 31001L,
31001L, 31001L, 31001L, 31001L, 31001L, 31001L, 31001L, 31001L,
31001L, 31001L, 31001L, 31001L, 31002L, 31002L, 31002L), hour = c(0L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 0L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 21L, 22L, 23L, 0L, 1L, 2L), number = c(16L, 1L, 7L,
1L, 2L, 12L, 27L, 33L, 36L, 41L, 50L, 36L, 39L, 34L, 22L, 38L,
40L, 27L, 37L, 31L, 16L, 15L, 16L, 8L, 3L, 3L, 1L, 1L, 2L, 15L,
30L, 74L, 47L, 49L, 40L, 43L, 54L, 51L, 56L, 48L, 99L, 75L, 48L,
28L, 24L, 14L, 3L, 16L, 18L, 3L)), .Names = c("Start.station.number",
"hour", "number"), row.names = c(NA, 50L), class = "data.frame")