Create iterator on DF based on another column

Question

I have a df like this:

I need a new column C with an iterator that counts the occurrences of each value in column B.

This is exactly what I need:

    A      B   C
    0      0   1
    0      0   2
    0      0   3
    0      1   1
    0      1   2
    0      2   1
    0      3   1
    0      3   2
    1      0   1 
    1      0   2
    1      1   1
    1      1   2
    2      0   1
    2      1   1
    2      2   1

The first 3 rows of C are 1-2-3 because B has 3 rows with value 0, then the next 2 rows of C are 1-2 because B has two rows with value 1, etc.

I tried with something like this:

    DF$C <- ifelse(DF$B == 0 , 1:length(DF),1:length(DF))

But it doesn't work for values other than 0, and I can't control it very well. I need some kind of for loop that checks column B and creates column C by iterating over it.

Hope the question is clear. Thank you in advance.

Does this answer your question? [Numbering rows within groups in a data frame](https://stackoverflow.com/questions/12925063/numbering-rows-within-groups-in-a-data-frame) — Andrew, Mar 04 '20 at 13:38

score 3 · Accepted Answer · answered Mar 04 '20 at 13:19

You can use run length encoding (rle) to get the lengths of the runs of consecutive equal values, then seq each length in an lapply and unlist the result.

    DF$C <- unlist(lapply(rle(DF$B)$lengths, seq))

    DF
    #>    A B C
    #> 1  0 0 1
    #> 2  0 0 2
    #> 3  0 0 3
    #> 4  0 1 1
    #> 5  0 1 2
    #> 6  0 2 1
    #> 7  0 3 1
    #> 8  0 3 2
    #> 9  1 0 1
    #> 10 1 0 2
    #> 11 1 1 1
    #> 12 1 1 2
    #> 13 2 0 1
    #> 14 2 1 1
    #> 15 2 2 1
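
As a possible shortcut, base R's sequence() performs the seq-and-unlist step in a single call. A minimal sketch, assuming DF is the data frame from the question (rebuilt here so the snippet runs on its own; the name runs is only illustrative):

```r
# Rebuild the example data from the question
DF <- data.frame(
  A = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  B = c(0L, 0L, 0L, 1L, 1L, 2L, 3L, 3L, 0L, 0L, 1L, 1L, 0L, 1L, 2L)
)

# Lengths of each run of consecutive equal values in B
runs <- rle(DF$B)

# sequence() concatenates 1:n for every run length n
DF$C <- sequence(runs$lengths)
```

Like the lapply version, this counts within consecutive runs, so a value that reappears later in B starts again at 1.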

Sotos · Answer 2 · 2020-03-04T14:08:42.867

We can start a new group wherever the diff is not 0 (i.e. the value changes from the previous row) and use those groups to create sequences, i.e.

    i1 <- cumsum(c(TRUE, diff(DF$B) != 0))
    ave(i1, i1, FUN = seq_along)
    #[1] 1 2 3 1 2 1 1 2 1 2 1 2 1 1 1
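
To see how the grouping vector is built, it may help to print the intermediate steps. A small sketch using column B from the question as a plain vector (the names starts and C are only illustrative):

```r
B <- c(0L, 0L, 0L, 1L, 1L, 2L, 3L, 3L, 0L, 0L, 1L, 1L, 0L, 1L, 2L)

# TRUE marks the first row of each run; the very first row always starts one
starts <- c(TRUE, diff(B) != 0)

# The running total of the start flags gives one id per run
i1 <- cumsum(starts)
i1
#> [1] 1 1 1 2 2 3 4 4 5 5 6 6 7 8 9

# Counter within each run id
C <- ave(i1, i1, FUN = seq_along)
```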

However, if your groups are based on both columns (you do not mention anything about column A), then we don't have to create the groups manually. We can just use both columns for grouping, i.e.

    with(DF, ave(A, A, B, FUN = seq_along))
    #[1] 1 2 3 1 2 1 1 2 1 2 1 2 1 1 1
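
The two approaches agree on the question's data, but they can differ when a value of B comes back within the same A after an interruption. A sketch on made-up data (toy, run_id, by_run and by_cols are hypothetical names, not from the question):

```r
# Hypothetical data: B returns to 0 within the same A after an interruption
toy <- data.frame(A = c(0L, 0L, 0L, 0L), B = c(0L, 1L, 0L, 0L))

# Run-based grouping: the second block of 0s restarts at 1
run_id <- cumsum(c(TRUE, diff(toy$B) != 0))
by_run <- ave(run_id, run_id, FUN = seq_along)
by_run
#> [1] 1 1 1 2

# Grouping by A and B: all rows with A = 0 and B = 0 share one counter
by_cols <- with(toy, ave(A, A, B, FUN = seq_along))
by_cols
#> [1] 1 1 2 3
```

Which result is wanted depends on whether the counter should reset on every change in B or only once per (A, B) combination.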

score 1 · Answer 3 · answered Mar 04 '20 at 17:51

With data.table, we can use rleid with rowid:

    library(data.table)
    setDT(DF)[, C := rowid(rleid(B))]
    DF
    #    A B C
    # 1: 0 0 1
    # 2: 0 0 2
    # 3: 0 0 3
    # 4: 0 1 1
    # 5: 0 1 2
    # 6: 0 2 1
    # 7: 0 3 1
    # 8: 0 3 2
    # 9: 1 0 1
    #10: 1 0 2
    #11: 1 1 1
    #12: 1 1 2
    #13: 2 0 1
    #14: 2 1 1
    #15: 2 2 1

data

    DF <- structure(list(A = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L,
    1L, 1L, 2L, 2L, 2L), B = c(0L, 0L, 0L, 1L, 1L, 2L, 3L, 3L, 0L,
    0L, 1L, 1L, 0L, 1L, 2L)), class = "data.frame", row.names = c(NA,
    -15L))
