0

I have a dataframe for multiple products and different date ranges. I want to assign unique value to each date so that even if the starting dates are different for various products, I can group by the dates.

df

acc product date
a1  p1      d1
a1  p1      d2
a1  p1      d3
a1  p1      d4
a1  p2      d1
a1  p2      d2
a1  p2      d3
a1  p3      d3
a1  p3      d4

I want to arrange the dates so that there is a unique identifier each for d1, d2, d3 etc.

I used the following code to try this:

df <- df %>% group_by(acc, product) %>% mutate(t = row_number())

Output

df

acc product date  t   EXPECTED
a1  p1      d1    1   1
a1  p1      d2    2   2
a1  p1      d3    3   3
a1  p1      d4    4   4
a1  p2      d1    1   1
a1  p2      d2    2   2
a1  p2      d3    3   3
a1  p3      d3    1   3
a1  p3      d4    2   4

Any suggestions for this?

hbabbar
  • 947
  • 4
  • 15
  • 33

1 Answers1

2

use dplyr::dense_rank()

df %>% mutate(new = dense_rank(date))
  acc product date new
1  a1      p1   d1   1
2  a1      p1   d2   2
3  a1      p1   d3   3
4  a1      p1   d4   4
5  a1      p2   d1   1
6  a1      p2   d2   2
7  a1      p2   d3   3
8  a1      p3   d3   3
9  a1      p3   d4   4

If however, you want to restart ranks for each acc use group_by before the mutate statement.

dput used

df <- structure(list(acc = c("a1", "a1", "a1", "a1", "a1", "a1", "a1", 
"a1", "a1"), product = c("p1", "p1", "p1", "p1", "p2", "p2", 
"p2", "p3", "p3"), date = c("d1", "d2", "d3", "d4", "d1", "d2", 
"d3", "d3", "d4")), class = "data.frame", row.names = c(NA, -9L
))
AnilGoyal
  • 25,297
  • 4
  • 27
  • 45