I have a data.table (or a data.frame):

library(data.table)
DT <- data.table(id = 1:9, name = rep(c('b', 'a', 'c'), each = 3))

where the column name is manually ordered (not alphabetically), but its values are always grouped together. How can I calculate the name_order column to achieve the result below, in either data.table or dplyr?

   id name name_order
1:  1    b          1
2:  2    b          1
3:  3    b          1
4:  4    a          2
5:  5    a          2
6:  6    a          2
7:  7    c          3
8:  8    c          3
9:  9    c          3
David Arenburg
Rickard

1 Answer

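We can use match, which can be applied in both dplyr and data.table. Because unique(name) keeps the values in order of first appearance, match returns the position of each group in that order: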

DT[, name_order := match(name, unique(name))]

Or, using dplyr:

library(dplyr)
DT %>%
   mutate(name_order = match(name, unique(name)))
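
As a quick check, here is a minimal sketch of the same match expression applied to the uneven-group data suggested in the comments (four rows of 'b', then three each of 'a' and 'c'). It uses a plain base R vector so it runs without any packages; the variable names are only for illustration, and the same expression is what goes inside DT[, ...] or mutate().

```r
# Counter-example from the comments: group sizes are uneven (4, 3, 3).
name <- c("b", rep(c("b", "a", "c"), each = 3))

# unique() keeps values in order of first appearance: "b", "a", "c".
first_seen <- unique(name)

# match() gives, for each element, its position in first_seen.
name_order <- match(name, first_seen)

print(data.frame(id = seq_along(name), name, name_order))
```

The numbering depends only on the order in which each value first appears, not on how many rows each group has.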
akrun
  • The dplyr solution will return wrong output – David Arenburg Jan 18 '17 at 11:52
  • @DavidArenburg Thanks for checking that. I wrapped it with `sort` – akrun Jan 18 '17 at 11:54
  • This is also wrong. It only works because every group has the same number of rows. It will fail in any other case, for instance `DT <- data.table(id = 1:10, name= c("b", rep(c('b','a','c'), each = 3)))`. Either way, I don't see what your answer adds to the overall SO database. If you have a solution that isn't present in the dupe target, it's better to add it there. – David Arenburg Jan 18 '17 at 11:58
  • @DavidArenburg But I also showed two other solutions, which will work for your example – akrun Jan 18 '17 at 12:00
  • @DavidArenburg The one you duped doesn't have `group_indices`, so I am keeping it here – akrun Jan 18 '17 at 12:04
  • But `group_indices` gives the wrong result – David Arenburg Jan 18 '17 at 12:06
  • @DavidArenburg I don't know why it behaves like that (it may be a bug). Anyway, one approach is the `factor`/`integer` coercion – akrun Jan 18 '17 at 12:11
  • I think it does some ranking first – David Arenburg Jan 18 '17 at 12:17
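
The factor/integer coercion mentioned in the comments can be sketched as follows. This is an illustration in base R under my own assumptions, not code from the thread: the variable names are hypothetical, and it only shows how factor codes behave, not what `group_indices` does internally. With default levels, factor sorts the values first, so the codes follow alphabetical order; passing levels = unique(name) keeps the order of first appearance instead.

```r
# Same uneven-group data as in the comments.
name <- c("b", rep(c("b", "a", "c"), each = 3))

# Default levels are sorted ("a", "b", "c"), so "b" gets code 2.
sorted_codes <- as.integer(factor(name))

# Levels fixed to order of first appearance ("b", "a", "c"), so "b" gets code 1.
appearance_codes <- as.integer(factor(name, levels = unique(name)))

print(data.frame(name, sorted_codes, appearance_codes))
```

Any approach that ranks or sorts the values before numbering them will give the first kind of result, which is not what the question asks for.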