I am struggling with what may be an easy question. I have a data frame with one column and n rows (n is a multiple of 3). I would like to add a second column of integers like 1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,... How can I achieve this with dplyr as a general solution for any number of rows (always a multiple of 3)?

I tried this:

df <- tibble(Col1 = c(1:12)) %>% 
  mutate(Col2 = rep(1:4, each=3))

This works, but the 1:4 is hard-coded for 12 rows. I would like a solution that works for any n rows with each = 3. Many thanks!

TarJae

3 Answers

You can specify both the each and length.out arguments of rep: each value is repeated 3 times, and the result is cut off at n() elements.

library(dplyr)

tibble(Col1 = c(1:12)) %>% 
  mutate(Col2 = rep(row_number(), each=3, length.out = n()))

#    Col1  Col2
#   <int> <int>
# 1     1     1
# 2     2     1
# 3     3     1
# 4     4     2
# 5     5     2
# 6     6     2
# 7     7     3
# 8     8     3
# 9     9     3
#10    10     4
#11    11     4
#12    12     4
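
To see why this generalizes, here is a minimal base R sketch of the same rep() call outside of mutate(). The helper name group_ids and its k argument are hypothetical, introduced only for illustration; seq_len(n) stands in for row_number().

```r
# Hypothetical helper (not part of the answer): group ids for n rows in blocks of k.
group_ids <- function(n, k = 3L) {
  # Each value of 1..n is repeated k times, then length.out truncates
  # the result back to n elements.
  rep(seq_len(n), each = k, length.out = n)
}

group_ids(12)  # 1 1 1 2 2 2 3 3 3 4 4 4
group_ids(7)   # 1 1 1 2 2 2 3  (n need not be a multiple of k)
```

Because the truncation happens after the repetition, the same expression should also cope with a row count that is not a multiple of 3; the last group is then simply shorter.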
Ronak Shah

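We can use gl, which generates a factor with each level repeated a fixed number of times, and convert the result to integer: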

library(dplyr)
df %>%
  mutate(col2 = as.integer(gl(n(), 3, n())))
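
As a rough illustration of what gl is doing here, the following base R sketch builds the same codes without dplyr. The variable names are made up for the example, and the second call (with only ceiling(n / 3) levels) is a suggested variant, not something from the answer.

```r
n <- 12L

# As in the answer: n levels, each repeated 3 times, truncated to length n.
# Only the first n / 3 levels actually occur in the result.
f_all <- gl(n, 3, length = n)

# Assumed variant: declare only as many levels as are needed.
f_min <- gl(ceiling(n / 3), 3, length = n)

# as.integer() on a factor returns the underlying integer codes.
ids_all <- as.integer(f_all)
ids_min <- as.integer(f_min)

identical(ids_all, ids_min)  # TRUE
nlevels(f_all)               # 12
nlevels(f_min)               # 4
```

Since the factor is converted to integer straight away, the unused levels in the first form do no harm; the variant only matters if you want to keep the factor itself.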
akrun

Integer division (%/% 3) over a zero-based sequence such as 0:(n - 1) yields 0, 0, 0, 1, 1, 1, ..., and adding 1 then gives the desired sequence. So this will also do:

df %>% mutate(col2 = 1+ (row_number()-1) %/% 3)

# A tibble: 12 x 2
    Col1  col2
   <int> <dbl>
 1     1     1
 2     2     1
 3     3     1
 4     4     2
 5     5     2
 6     6     2
 7     7     3
 8     8     3
 9     9     3
10    10     4
11    11     4
12    12     4
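
Note that col2 comes out as <dbl> above because the literals 1 and 3 are doubles. If an integer column is preferred, a small sketch (base R, with seq_len(n) standing in for row_number(); the variable names are only illustrative) is to use integer literals throughout:

```r
n <- 12L

# Double literals promote the whole expression to double.
ids_dbl <- 1 + (seq_len(n) - 1) %/% 3

# Integer literals (suffix L) keep the result an integer vector.
ids_int <- 1L + (seq_len(n) - 1L) %/% 3L

is.double(ids_dbl)          # TRUE
is.integer(ids_int)         # TRUE
all(ids_dbl == ids_int)     # TRUE
```

Inside mutate() the same idea would read 1L + (row_number() - 1L) %/% 3L.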
AnilGoyal