0

I have the following data:

ID date
1  18/08/2020
1  19/08/2020
1  17/08/2020
2  15/07/2020
2  14/07/2020
3  19/08/2020
3  18/08/2020
3  17/08/2020

I want to order the data by date within each ID and add a new column to show the order of dates for each ID. The data should look like this

ID date        order
1  18/08/2020   2
1  19/08/2020   3
1  17/08/2020   1
2  15/07/2020   2
2  14/07/2020   1
3  19/08/2020   3
3  18/08/2020   2
3  17/08/2020   1

How to do this? thank you very much.

Deep North
  • 150
  • 6

4 Answers4

2

You can use dense_rank from dplyr :

library(dplyr)

df %>%
  mutate(date = lubridate::dmy(date)) %>%
  group_by(ID) %>%
  mutate(order = dense_rank(date))

#     ID date       order
#  <int> <date>     <int>
#1     1 2020-08-18     2
#2     1 2020-08-19     3
#3     1 2020-08-17     1
#4     2 2020-07-15     2
#5     2 2020-07-14     1
#6     3 2020-08-19     3
#7     3 2020-08-18     2
#8     3 2020-08-17     1
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

Using a tidyverse approach:

library(dplyr)

df %>%
  mutate(date = as.POSIXct(strptime(date, format = "%d/%m/%y", tz = "GMT"))) %>%
  group_by(ID) %>%
  mutate(order = order(date))

#> # A tibble: 8 x 3
#> # Groups:   ID [3]
#>      ID date                order
#>   <int> <dttm>              <int>
#> 1     1 2020-08-18 00:00:00     3
#> 2     1 2020-08-19 00:00:00     1
#> 3     1 2020-08-17 00:00:00     2
#> 4     2 2020-07-15 00:00:00     2
#> 5     2 2020-07-14 00:00:00     1
#> 6     3 2020-08-19 00:00:00     3
#> 7     3 2020-08-18 00:00:00     2
#> 8     3 2020-08-17 00:00:00     1

Data

structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), date = structure(c(4L, 
5L, 3L, 2L, 1L, 5L, 4L, 3L), .Label = c("14/07/2020", "15/07/2020", 
"17/08/2020", "18/08/2020", "19/08/2020"), class = "factor")), 
class = "data.frame", row.names = c(NA, -8L))
Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
1

This is easy using data.table.

If your data.frame is in df,

library(data.table)
setDT(df)
df[, order := rank(date), ID]
pseudospin
  • 2,737
  • 1
  • 4
  • 19
1

Using ave.

data$order <- with(data, ave(as.numeric(date), ID, FUN=rank))
data
#   ID       date order
# 1  1 2020-08-18     2
# 2  1 2020-08-19     3
# 3  1 2020-08-17     1
# 4  2 2020-07-15     2
# 5  2 2020-07-14     1
# 6  3 2020-08-19     3
# 7  3 2020-08-18     2
# 8  3 2020-08-17     1

Data

data <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), date = structure(c(18492, 
18493, 18491, 18458, 18457, 18493, 18492, 18491), class = "Date")), row.names = c(NA, 
-8L), class = "data.frame")
jay.sf
  • 60,139
  • 8
  • 53
  • 110