put sequence numbers in R

Question

I have a data frame like the one below:

date    input   
....     org
....     Min 1
....     Min 1
....     Min 1
....     Min 2
....     Min 2
....     Min 3
....     org
....     org
....     Min 1
....     Min 2
....     Min 2
....     Min 3
....     Min 3
....     Min 4

I want to add another column that numbers each consecutive run of the same input value, like below:

date    input      Number_input
....     org           1
....     Min 1         2
....     Min 1         2
....     Min 1         2
....     Min 2         3
....     Min 2         3
....     Min 3         4
....     org           5
....     org           5
....     Min 1         6
....     Min 2         7
....     Min 2         7
....     Min 3         8
....     Min 3         8
....     Min 4         9

Can anyone help me? ;-)

Hi Abdel, it's difficult to know exactly what you're looking for. Perhaps you could reformat your examples, or better yet, provide the output of `dput()`? You can edit your question and paste the output. You can surround it with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more info. — Ian Campbell, May 07 '20 at 14:15
Thanks, @Ian Campbell, for formatting my post correctly. I think it's now clearer what I'm trying to get. — Abdel_El, May 07 '20 at 14:27

Matt · Answer 1 · 2020-05-07T14:42:27.770

With dplyr:

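library(dplyr)

# rle() finds runs of identical consecutive values;
# repeat each run's index by that run's length.
df %>%
  mutate(Number_input = rle(input)$lengths %>%
    {rep(seq_along(.), .)})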

Which gives:

   date  input Number_input
   <chr> <chr>        <int>
 1 ….    org              1
 2 ….    Min 1            2
 3 ….    Min 1            2
 4 ….    Min 1            2
 5 ….    Min 2            3
 6 ….    Min 2            3
 7 ….    Min 3            4
 8 ….    org              5
 9 ….    org              5
10 ….    Min 1            6
11 ….    Min 2            7
12 ….    Min 2            7
13 ….    Min 3            8
14 ….    Min 3            8
15 ….    Min 4            9
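The same run-length idea also works without dplyr. Here is a minimal base R sketch, assuming the input values from the question; the names `input` and `runs` are only for illustration:

```r
# Input column from the question, as a plain character vector
input <- c("org", "Min 1", "Min 1", "Min 1", "Min 2", "Min 2", "Min 3",
           "org", "org", "Min 1", "Min 2", "Min 2", "Min 3", "Min 3", "Min 4")

# rle() returns the length of each run of identical consecutive values
runs <- rle(input)

# Repeat the run index (1, 2, 3, ...) as many times as the run is long
Number_input <- rep(seq_along(runs$lengths), runs$lengths)
Number_input
# [1] 1 2 2 2 3 3 4 5 5 6 7 7 8 8 9
```

The result can then be assigned to the data frame with `df$Number_input <- Number_input`.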

dput:

structure(list(date = c("….", "….", "….", "….", "….", "….", "….", 
"….", "….", "….", "….", "….", "….", "….", "…."), input = c("org", 
"Min 1", "Min 1", "Min 1", "Min 2", "Min 2", "Min 3", "org", 
"org", "Min 1", "Min 2", "Min 2", "Min 3", "Min 3", "Min 4")), row.names = c(NA, 
-15L), class = c("tbl_df", "tbl", "data.frame"))

Found the solution from @mpettis here: Increment by 1 for every change in column

GKi · Answer 2 · 2020-05-07T15:24:54.447

You can convert the column to a factor, apply diff to its integer codes, and take the cumsum of the positions where the value changes (here x is the data frame):

cumsum(c(TRUE, diff(unclass(factor(x$input)))!=0))
# [1] 1 2 2 2 3 3 4 5 5 6 7 7 8 8 9

Or compare the vector with a copy of itself shifted by one position and count the inequalities:

cumsum(c(TRUE, x$input[-1] != x$input[-nrow(x)]))
# [1] 1 2 2 2 3 3 4 5 5 6 7 7 8 8 9

Or use xtfrm instead of factor:

cumsum(c(TRUE, diff(xtfrm(x$input))!=0))
# [1] 1 2 2 2 3 3 4 5 5 6 7 7 8 8 9
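
As a quick sanity check, the three variants can be run side by side. This is only a sketch: it assumes `x` holds the input column from the question, and the names `by_factor`, `by_shift` and `by_xtfrm` are made up for the comparison:

```r
# Assumed data: the input column from the question
x <- data.frame(
  input = c("org", "Min 1", "Min 1", "Min 1", "Min 2", "Min 2", "Min 3",
            "org", "org", "Min 1", "Min 2", "Min 2", "Min 3", "Min 3", "Min 4"),
  stringsAsFactors = FALSE
)

# A change is flagged wherever a value differs from the one before it;
# the leading TRUE starts the count at 1.
by_factor <- cumsum(c(TRUE, diff(unclass(factor(x$input))) != 0))
by_shift  <- cumsum(c(TRUE, x$input[-1] != x$input[-nrow(x)]))
by_xtfrm  <- cumsum(c(TRUE, diff(xtfrm(x$input)) != 0))

# All three should give the same run numbering
stopifnot(identical(by_factor, by_shift), identical(by_shift, by_xtfrm))
by_shift
# [1] 1 2 2 2 3 3 4 5 5 6 7 7 8 8 9
```

Note that all of these comparisons propagate NA, so a column with missing values would need extra handling.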

score 1 · Accepted Answer · answered May 07 '20 at 15:29

It seems you are looking for rleid() from data.table:

df$Number_input <- data.table::rleid(df$input)
df

   data input Number_input
1    ….   org            1
2    …. Min 1            2
3    …. Min 1            2
4    …. Min 1            2
5    …. Min 2            3
6    …. Min 2            3
7    …. Min 3            4
8    ….   org            5
9    ….   org            5
10   …. Min 1            6
11   …. Min 2            7
12   …. Min 2            7
13   …. Min 3            8
14   …. Min 3            8
15   …. Min 4            9

Reproducible data

df <- data.frame(
  data = "….",
  input = c(
    "org", "Min 1", "Min 1", "Min 1", "Min 2", "Min 2", "Min 3", 
    "org", "org", "Min 1", "Min 2", "Min 2", "Min 3", "Min 3", "Min 4"
  )
)

Thanks @sindri_baldur, it's simpler with rleid ;-) — Abdel_El, May 08 '20 at 09:23
