Consider the following table for example:

# Example data: three groups of four rows each
Group <- c("AGroup", "AGroup", "AGroup", "AGroup", "BGroup", "BGroup", "BGroup", "BGroup", "CGroup", "CGroup", "CGroup", "CGroup")
Status <- c("Low", "Low", "High", "High", "High", "Low", "High", "Low", "Low", "Low", "High", "High")

df <- data.frame(Group, Status)

# Desired output column, filled in by hand for illustration
df$CountByGroup <- c(1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1, 2)

This creates the following table:

Group   Status  CountByGroup
AGroup  Low     1
AGroup  Low     2
AGroup  High    1
AGroup  High    2
BGroup  High    1
BGroup  Low     1
BGroup  High    1
BGroup  Low     1
CGroup  Low     1
CGroup  Low     2
CGroup  High    1
CGroup  High    2

The CountByGroup column is what I am trying to create. In the first row, "Low" has appeared once so far for "AGroup", so the entry is 1. In the second row, "Low" directly follows another "Low", so the entry is 2. If "Low" appeared a third consecutive time in the third row, CountByGroup would be 3 there.

The count also restarts for each "Group": the first entry for a new group is always 1, since it is the first time any status has appeared for that group.

GM01
  • @GregorThomas the row number given by Mel G is not correct. The OP is not interested in row numbers; e.g., check rows 7 and 8. The code above gives 2 as the answer, while the OP needs 1. – Onyambu Nov 16 '22 at 00:17
  • @MelG that code is not correct. – Onyambu Nov 16 '22 at 00:17
  • Oh - it seems the order matters here. So as @onyambu points out, `BGroup` and `High` appear on lines `5` and `7`, but because they are not adjacent they are each scored as `1` in the `CountByGroup` column. – Dan Adams Nov 16 '22 at 00:25
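
The distinction raised in these comments can be illustrated with a minimal base R sketch. It is not code from the thread: the names `naive`, `run_id`, and `run_aware` are made up for illustration. A running count per (Group, Status) pair ignores adjacency, whereas a count per run of consecutive identical rows restarts whenever the group or the status changes.

```r
# Same example data as in the question
Group  <- rep(c("AGroup", "BGroup", "CGroup"), each = 4)
Status <- c("Low", "Low", "High", "High", "High", "Low",
            "High", "Low", "Low", "Low", "High", "High")
n <- length(Status)

# Naive count: running index within each (Group, Status) pair,
# regardless of whether the rows are adjacent
naive <- ave(seq_len(n), Group, Status, FUN = seq_along)

# Run-aware count: a new run starts whenever Group or Status
# differs from the previous row
new_run   <- c(TRUE, Group[-1] != Group[-n] | Status[-1] != Status[-n])
run_id    <- cumsum(new_run)
run_aware <- ave(seq_len(n), run_id, FUN = seq_along)

data.frame(Group, Status, naive, run_aware)
```

In this sketch, rows 7 and 8 get 2 under the naive count but 1 under the run-aware count, which is the behaviour the question asks for.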

1 Answer

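You could use data.table. `rleid(Status)` assigns an ID to each run of consecutive identical `Status` values, and `rowid()` then numbers the rows within each combination of `Group` and run ID: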

library(data.table)

setDT(df)[, CountByGroup := rowid(Group, rleid(Status))]
df
     Group Status CountByGroup
 1: AGroup    Low            1
 2: AGroup    Low            2
 3: AGroup   High            1
 4: AGroup   High            2
 5: BGroup   High            1
 6: BGroup    Low            1
 7: BGroup   High            1
 8: BGroup    Low            1
 9: CGroup    Low            1
10: CGroup    Low            2
11: CGroup   High            1
12: CGroup   High            2
Onyambu
  • `data.table::rleid()` is a great function to know about, thanks! For more info on combining this with {tidyverse} workflows, see [this](https://stackoverflow.com/questions/33507868/is-there-a-dplyr-equivalent-to-data-tablerleid) thread. – Dan Adams Nov 16 '22 at 00:30
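
For readers working in the tidyverse, the same idea might be sketched as follows. This is not from the answer above: it assumes dplyr 1.1.0 or later, which provides `consecutive_id()`, and the helper column name `run` is arbitrary.

```r
library(dplyr)

df <- data.frame(
  Group  = rep(c("AGroup", "BGroup", "CGroup"), each = 4),
  Status = c("Low", "Low", "High", "High", "High", "Low",
             "High", "Low", "Low", "Low", "High", "High")
)

result <- df %>%
  # consecutive_id() starts a new ID whenever Group or Status changes
  group_by(run = consecutive_id(Group, Status)) %>%
  # number the rows within each run
  mutate(CountByGroup = row_number()) %>%
  ungroup() %>%
  select(-run)

result
```

Passing both `Group` and `Status` to `consecutive_id()` means a run ends at a group boundary even when the status happens to stay the same, as between rows 4 and 5 of the example.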