Sequence with repeating numbers

Question

Data

I have a data.frame that looks something like this:

df <- data.frame(id = c(1:10),
                 color = c(rep("red", 5), rep("blue", 5)))
df
#>    id color
#> 1   1   red
#> 2   2   red
#> 3   3   red
#> 4   4   red
#> 5   5   red
#> 6   6  blue
#> 7   7  blue
#> 8   8  blue
#> 9   9  blue
#> 10 10  blue

Expected result

I'm trying to create a new column, say pair, that assigns a pair ID to each group of 2 consecutive IDs. For example, I want to end up with a data.frame that looks like this:

df
#>    id color pair
#> 1   1   red    1
#> 2   2   red    1
#> 3   3   red    2
#> 4   4   red    2
#> 5   5   red    3
#> 6   6  blue    3
#> 7   7  blue    4
#> 8   8  blue    4
#> 9   9  blue    5
#> 10 10  blue    5

Current method

I'm wondering whether there's a more concise way to achieve this than what I'm already doing. I have looked through the seq() documentation without any luck. Here is what I have currently; it gives me the desired output but is not very succinct.

library(dplyr)  # provides %>% and mutate()

df %>% 
  dplyr::mutate(pair = sort(rep(seq(length.out = nrow(df)/2), 2)))

#    id color pair
# 1   1   red    1
# 2   2   red    1
# 3   3   red    2
# 4   4   red    2
# 5   5   red    3
# 6   6  blue    3
# 7   7  blue    4
# 8   8  blue    4
# 9   9  blue    5
# 10 10  blue    5

Does anyone have any ideas, or another function besides seq() that would do the job?

Shree · Answer 1 · 2019-06-17T17:36:22.000

3

Here's a simple way with rep() from base R:

df$pair <- rep(1:nrow(df), each = 2, length.out = nrow(df))

df

   id color pair
1   1   red    1
2   2   red    1
3   3   red    2
4   4   red    2
5   5   red    3
6   6  blue    3
7   7  blue    4
8   8  blue    4
9   9  blue    5
10 10  blue    5

With dplyr:

df %>% 
  mutate(pair = rep(1:nrow(.), each = 2, length.out = nrow(.)))
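This works because rep() applies each first and then truncates the result to length.out, so generating more IDs than needed is harmless. As a minimal sketch, the same idea can be wrapped in a small function; the name pair_id and the size argument are hypothetical and not part of the answer above, and seq_len() is swapped in for 1:n only so that a zero-row input is handled.

```r
# Hypothetical helper (illustrative name, not from the original answer).
# Repeats each ID `size` times, then trims to exactly n values.
pair_id <- function(n, size = 2) {
  # seq_len(n) avoids the 1:0 pitfall when n is 0
  rep(seq_len(n), each = size, length.out = n)
}

pair_id(10)  # 1 1 2 2 3 3 4 4 5 5
pair_id(5)   # 1 1 2 2 3 (an odd row count leaves the last ID unpaired)
```

With the question's data this would be used as df$pair <- pair_id(nrow(df)).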

edited Jun 17 '19 at 17:36

answered Jun 17 '19 at 17:31

Shree


score 1 · Answer 2 · answered Jun 17 '19 at 17:29

1

One possibility could be:

library(dplyr)

df %>%
 mutate(pair = gl(n()/2, 2))  # note: gl() returns a factor, not an integer

   id color pair
1   1   red    1
2   2   red    1
3   3   red    2
4   4   red    2
5   5   red    3
6   6  blue    3
7   7  blue    4
8   8  blue    4
9   9  blue    5
10 10  blue    5
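One detail that the printed output hides: gl() returns a factor, so pair here is not numeric even though it prints the same way. A small base-R check, assuming the 10-row example (5 levels, each repeated twice), might look like this:

```r
# Same call shape as above, with n()/2 replaced by the literal 5
# for the 10-row example data.
pair <- gl(5, 2)

is.factor(pair)    # TRUE
levels(pair)       # "1" "2" "3" "4" "5"

# The levels are 1..5 in order, so the integer codes match the labels.
pair_int <- as.integer(pair)
pair_int           # 1 1 2 2 3 3 4 4 5 5
```

If an integer column is needed downstream, wrapping the gl() call in as.integer() is one way to get it.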

answered Jun 17 '19 at 17:29

tmfmnk


score 1 · Answer 3 · answered Jun 17 '19 at 17:30

1

We may use integer division,

(df$pair <- (1:nrow(df) - 1) %/% 2)
#  [1] 0 0 1 1 2 2 3 3 4 4

which also nicely generalizes to larger groups; e.g.,

(df$pair <- (1:nrow(df) - 1) %/% 3)
#  [1] 0 0 0 1 1 1 2 2 2 3
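Note that the IDs above start at 0, whereas the expected result in the question starts at 1. Adding 1 after the division closes that gap. A minimal sketch, where group_id and group_size are illustrative names that do not appear in the answer:

```r
# Hypothetical helper: 1-based group IDs via integer division.
# (seq_len(n) - 1) shifts positions to 0-based so that the first
# `group_size` rows all divide to 0; the final + 1 makes IDs start at 1.
group_id <- function(n, group_size = 2) {
  (seq_len(n) - 1) %/% group_size + 1
}

group_id(10)     # 1 1 2 2 3 3 4 4 5 5
group_id(10, 3)  # 1 1 1 2 2 2 3 3 3 4
```

The result is a double vector rather than an integer one; wrap it in as.integer() if the type matters.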

answered Jun 17 '19 at 17:30

Julius Vainora


akrun · Accepted Answer · 2019-06-17T17:37:37.337

1

Another option, using gl():

library(dplyr)
df %>%
   mutate(pair = as.integer(gl(n(), 2, n())))
#    id color pair
#1   1   red    1
#2   2   red    1
#3   3   red    2
#4   4   red    2
#5   5   red    3
#6   6  blue    3
#7   7  blue    4
#8   8  blue    4
#9   9  blue    5
#10 10  blue    5

Or with rep() and cumsum():

df %>% 
    mutate(pair = cumsum(rep(c(TRUE, FALSE), length.out = n())))

Or, more simply, in base R (the length-2 vector is recycled down the column):

df$pair <- c(TRUE, FALSE)
df$pair <- cumsum(df$pair)
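The base R version relies on the length-2 vector being recycled to fill the column, which only works when the number of rows is a multiple of 2. A short sketch of that limitation, using a hypothetical 5-row data.frame called df_odd (not part of the question's data):

```r
# Hypothetical data with an odd number of rows.
df_odd <- data.frame(id = 1:5)

# Assigning a length-2 vector to a 5-row data.frame is an error,
# because 5 is not a multiple of 2. try() captures it so the script continues.
res <- try(df_odd$pair <- c(TRUE, FALSE), silent = TRUE)
inherits(res, "try-error")  # TRUE

# rep() with length.out works for any row count.
df_odd$pair <- cumsum(rep(c(TRUE, FALSE), length.out = nrow(df_odd)))
df_odd$pair  # 1 1 2 2 3
```

For the 10-row example in the question, both forms give the same result.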

edited Jun 17 '19 at 17:37

answered Jun 17 '19 at 17:31

akrun


1

Thank you so much for your many solutions! I've noted them for future use. – Felix T. Jun 17 '19 at 18:52
