Consecutive Across and Unique Number Within Group

Question

I have a data frame, which looks like this:

DF_A <- data.frame(
  Group_1 = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "C"),
  Group_2 = c("A", "B", "C", "A", "B", "A", "B", "A", "C", "A")
)

I would like to assign a consecutive number within each Group_1 ID, such that identical Group_2 IDs within the same Group_1 ID get the same number. For example, A+A starts with 1, A+B continues with 2 (same Group_1 ID, new Group_2 ID), ..., and a later A+A is 1 again (a repetition). B+A is 1 (new Group_1 ID), B+B is 2 (same Group_1 ID, new Group_2 ID), and so forth.

The result should look like this.

DF_B <- data.frame(
  Group_1 = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "C"),
  Group_2 = c("A", "B", "C", "A", "B", "A", "B", "A", "C", "A"),
  ID      = c(1, 2, 3, 1, 2, 1, 2, 1, 3, 1)
)

I looked through various posts on related approaches, such as numbering single groups within groups or numbering combinations, without success; this case does not seem to be covered by previous posts.

Thank you in advance.

you mean create `factor` out of combinations of Group1 and Group2? row 9 should have ID=3? — chinsoon12, Feb 15 '18 at 02:48
A number, yes. The result is not a factor. One might consider to 'create' a factor value as intermediate step. — Dan, Feb 15 '18 at 02:54

score 2 · Accepted Answer · answered Feb 15 '18 at 03:08

One way to do it with ave is

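# ave() writes its result back into its first argument, so pass character data
# and convert the result to integer afterwards (a factor input would yield NAs)
DF_A$ID <- as.integer(ave(as.character(DF_A$Group_2), DF_A$Group_1,
                          FUN = function(x) match(x, unique(x))))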

DF_A
#   Group_1 Group_2 ID
#1        A       A  1
#2        A       B  2
#3        A       C  3
#4        A       A  1
#5        A       B  2
#6        B       A  1
#7        B       B  2
#8        B       A  1
#9        B       C  3
#10       C       A  1

The equivalent dplyr way is:

library(dplyr)
DF_A %>%
  group_by(Group_1) %>%
  mutate(ID = match(Group_2, unique(Group_2)))
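To see why match(x, unique(x)) produces this numbering, it may help to run it on a single group. The sketch below uses a made-up vector `x` (not taken from the question): unique() keeps values in order of first appearance, and match() returns each element's position in that list.

```r
# Hypothetical Group_2 values for one Group_1 block (illustration only)
x <- c("B", "A", "B", "C", "A")

# unique() keeps values in order of first appearance
first_seen <- unique(x)

# match() gives the position of each element within first_seen
ids <- match(x, first_seen)

print(first_seen)
print(ids)
```

Note that the numbering follows order of appearance rather than alphabetical order, so "B" is numbered 1 here because it occurs first.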

Thanks. Your answer works best for me. However, all the other answers might be valuable for further applications. Thanks again. — Dan, Feb 15 '18 at 03:26

score 1 · Answer 2 · answered Feb 15 '18 at 02:56

You can split the data into groups by Group_1, create a factor from Group_2 within each group, and then convert it to integer:

DF_A$ID <- unlist(by(DF_A, DF_A$Group_1, function(x) as.integer(factor(x$Group_2))))
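One caveat, as far as I can tell: by() returns its pieces in the order of the Group_1 levels, so unlist() only lines up with the rows when the data is already sorted by Group_1, as it is in the question. A minimal sketch of an order-preserving variant, using base R's split() and unsplit() on a made-up data frame `df` (hypothetical, not from the question):

```r
# Hypothetical data with Group_1 values interleaved (illustration only)
df <- data.frame(
  Group_1 = c("B", "A", "B", "A", "A"),
  Group_2 = c("X", "X", "Y", "Z", "X"),
  stringsAsFactors = FALSE
)

# Number Group_2 within each Group_1 block via factor codes
per_group <- lapply(split(df$Group_2, df$Group_1),
                    function(x) as.integer(factor(x)))

# unsplit() puts each group's values back at their original row positions
df$ID <- unsplit(per_group, df$Group_1)
df
```

Because factor() sorts its levels, this numbers values by sorted order within each group, which coincides with order of appearance in the question's data.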


chinsoon12


score 1 · Answer 3 · answered Feb 15 '18 at 03:02

We can use dense_rank() from dplyr.

library(dplyr)

DF_A2 <- DF_A %>%
  group_by(Group_1) %>%
  mutate(ID = dense_rank(Group_2)) %>%
  ungroup()
DF_A2
# # A tibble: 10 x 3
#    Group_1 Group_2    ID
#    <fct>   <fct>   <int>
#  1 A       A           1
#  2 A       B           2
#  3 A       C           3
#  4 A       A           1
#  5 A       B           2
#  6 B       A           1
#  7 B       B           2
#  8 B       A           1
#  9 B       C           3
# 10 C       A           1
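It may be worth noting that dense_rank() numbers values by their sorted rank, while match(x, unique(x)) numbers them by first appearance. For the question's data the two agree, but they can differ. A small sketch on a made-up data frame `df` (hypothetical, chosen so that "B" appears before "A"):

```r
library(dplyr)

# Hypothetical data where Group_2 does not first appear in sorted order
df <- data.frame(
  Group_1 = c("A", "A", "A"),
  Group_2 = c("B", "A", "B"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(Group_1) %>%
  mutate(
    by_rank  = dense_rank(Group_2),             # numbering by sorted value
    by_order = match(Group_2, unique(Group_2))  # numbering by first appearance
  ) %>%
  ungroup()
res
```

Which of the two is wanted depends on whether the IDs should reflect the order in which Group_2 values occur or their sort order.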

score 1 · Answer 4 · answered Feb 15 '18 at 03:15 · edited Feb 15 '18 at 03:36

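You could use the integer codes of the factor levels. Wrapping Group_2 in c() dropped the factor attribute in the R version current at the time; since R 4.1.0, c() on a factor returns a factor, so as.integer(factor()) is the safer spelling.

within(DF_A, { ID = ave(as.integer(factor(Group_2)), Group_1, FUN = c) })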
#   Group_1 Group_2 ID
# 1        A       A  1
# 2        A       B  2
# 3        A       C  3
# 4        A       A  1
# 5        A       B  2
# 6        B       A  1
# 7        B       B  2
# 8        B       A  1
# 9        B       C  3
# 10       C       A  1
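The factor codes here are global across the whole column, so the result matches the requested numbering only because every Group_1 block in the question's data starts at level A and uses consecutive levels. A sketch of how the two can diverge, on a made-up data frame `df` (hypothetical) in which group "B" never contains "A":

```r
# Hypothetical data: group "B" never contains the first level "A"
df <- data.frame(
  Group_1 = c("A", "A", "B", "B"),
  Group_2 = c("A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# Global factor codes: one numbering shared by all groups
global_codes <- as.integer(factor(df$Group_2))

# Within-group numbering: restarts at 1 inside each Group_1 block
within_group <- ave(global_codes, df$Group_1,
                    FUN = function(x) match(x, unique(x)))

data.frame(df, global_codes, within_group)
```

In this example group "B" gets global codes 2 and 3, whereas the within-group numbering restarts at 1.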

edited Feb 15 '18 at 03:36

answered Feb 15 '18 at 03:15

Rich Scriven



even shorter with `within`? `within(DF_A, ID <- ave(c(Group_2), Group_1, FUN = c))` – chinsoon12 Feb 15 '18 at 03:17
@chinsoon12 - most definitely! – Rich Scriven Feb 15 '18 at 03:20
