
I have the following data frame:

> test = data.frame(A = sample(1:5, 10, replace = T)) %>% arrange(A)
> test

   A
1  1
2  1
3  1
4  2
5  2
6  2
7  2
8  4
9  4
10 5

I now want every row to have an ID that is only incremented when the value of A changes. This is what I have tried:

> test = test %>% mutate(id = as.numeric(rownames(test))) %>% group_by(A) %>% mutate(id = min(id))
> test

       A    id
   (int) (dbl)
1      1     1
2      1     1
3      1     1
4      2     4
5      2     4
6      2     4
7      2     4
8      4     8
9      4     8
10     5    10

However, I would like to get the following:

       A    id
   (int) (dbl)
1      1     1
2      1     1
3      1     1
4      2     2
5      2     2
6      2     2
7      2     2
8      4     3
9      4     3
10     5     4
aseipel
  • Is your A column always increasing? Does "when the value of A changes" mean that for A = 3 3 4 3, you want 1 1 2 3, for example? – Frank Feb 12 '16 at 19:02
  • Initially, my A column is unsorted, but I will sort it before assigning the IDs. So A = 3 3 4 3 should be 1 1 2 1 because I will sort it to be A = 3 3 3 4 – aseipel Feb 12 '16 at 19:15
  • Ok, thanks for clarifying. Let us know if the linked question (in a banner at the top, if you refresh the page) doesn't work for you. – Frank Feb 12 '16 at 19:16

2 Answers

library(dplyr)

test %>% mutate(id = dense_rank(A))
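
For a reproducible check, here is a minimal sketch that assumes the sorted column from the question typed in by hand (so the result does not depend on sample()) and stores the output in a hypothetical variable named result:

```r
library(dplyr)

# The sorted example column from the question, entered by hand.
test <- data.frame(A = c(1, 1, 1, 2, 2, 2, 2, 4, 4, 5))

# dense_rank() ranks values without gaps: ties share an id and the
# next distinct value gets the next integer.
result <- test %>% mutate(id = dense_rank(A))

result$id
# 1 1 1 2 2 2 2 3 3 4
```

Because dense_rank() works on the values rather than on row positions, equal values get the same id even if the column has not been sorted first.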
davechilders

One compact option is data.table. Convert the 'data.frame' to a 'data.table' with setDT(test), group by 'A', and assign (:=) .GRP as the new 'id' column. .GRP is the group counter, so each unique value of 'A' gets the next integer in sequence.

library(data.table)
setDT(test)[, id:=.GRP, A]
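
As a self-contained sketch of the .GRP approach, assuming the question's sorted column entered by hand and the by argument spelled out by name:

```r
library(data.table)

# The sorted example column from the question, entered by hand.
test <- data.frame(A = c(1, 1, 1, 2, 2, 2, 2, 4, 4, 5))

# setDT() converts 'test' to a data.table by reference; .GRP is the
# running group counter, numbered in order of each group's first row.
setDT(test)[, id := .GRP, by = A]

test$id
# 1 1 1 2 2 2 2 3 3 4
```

Note that setDT() and := both modify 'test' in place, so no reassignment is needed.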

If the values of 'A' are not sorted, e.g. 3, 3, 4, 3, and we want 1, 1, 2, 3 for the 'id', we can use rleid, which starts a new id at every change in value:

setDT(test)[, id:= rleid(A)]
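
To see how run-based ids differ from value-based ids, here is a rough base R sketch. It uses rle() as an analogue of rleid(), not data.table's own implementation, and the variable names run_id and value_id are only illustrative:

```r
# Unsorted example from the comments on the question.
A <- c(3, 3, 4, 3)

# Run-based ids: a new id starts whenever the value differs from the
# previous element, so the final 3 gets its own id.
runs <- rle(A)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# Value-based ids: equal values share an id wherever they occur.
value_id <- match(A, unique(A))

run_id    # 1 1 2 3
value_id  # 1 1 2 1
```

On a column that is already sorted, both give the same result.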

Or we convert 'A' to a factor and then coerce it to integer, which returns the level codes

library(dplyr)
test %>%
    mutate(id = as.integer(factor(A)))

Or we can match 'A' with the unique values in 'A'.

test %>%
     mutate(id = match(A, unique(A)))
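
The factor and match options agree on sorted data but can differ on unsorted data. A small base R sketch, assuming a made-up unsorted vector that is not part of the question:

```r
# Hypothetical unsorted input, chosen only to show the difference.
A <- c(4, 1, 4, 2)

# factor() sorts its levels (1, 2, 4), so ids follow value order.
factor_id <- as.integer(factor(A))

# unique() keeps first-appearance order (4, 1, 2), so ids follow
# the order in which values are first seen.
match_id <- match(A, unique(A))

factor_id  # 3 1 3 2
match_id   # 1 2 1 3
```

Since the question sorts 'A' before assigning ids, either option produces the desired output there.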

Or, with dplyr versions newer than 0.4.0, we can use group_indices (this is also covered in the linked duplicate)

test %>%
    mutate(id = group_indices_(test, .dots = "A"))
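
In more recent dplyr releases the underscore-suffixed verbs such as group_indices_ are deprecated. A possible equivalent, assuming dplyr 1.0.0 or later (where cur_group_id() is available) and the question's column entered by hand:

```r
library(dplyr)

# The sorted example column from the question, entered by hand.
test <- data.frame(A = c(1, 1, 1, 2, 2, 2, 2, 4, 4, 5))

# cur_group_id() returns the index of the current group inside a
# grouped mutate(); ungroup() drops the grouping afterwards.
result <- test %>%
    group_by(A) %>%
    mutate(id = cur_group_id()) %>%
    ungroup()

result$id
# 1 1 1 2 2 2 2 3 3 4
```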
akrun