Count events in R per row depending on data range

Question

I have a dataset of 1000 patients (Patient 1-1000) undergoing the same type of procedure but by diffrent surgeons. I am interested to count the number of operations performed by each surgeon from study start date 27/06/2023 (for example) before the next procedure and insert this count/number per each patient/row. For example I need to know how many operations surgeon A performed on previous patients (1 and 2) before operating on patient 3 (e.g. 2 operation).Same for surgeon B etc.

I gues there is some formula in dplyr but I cannot get my head around it.

Image of dataset and what I need to get

Patient Surgeon Operation Date  Event before index (operation date)
1   A   28/06/2023  0
2   A   29/06/2023  1
3   A   30/06/2023  2
4   B   1/07/2023   0
5   C   2/07/2023   1
6   C   3/07/2023   2

Hi Daniel! What does your starting dataset look like? Please read the following post to make your question reproducible and get better, faster help: [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) — jpsmith, Jun 28 '23 at 17:08
This existing question covers a number of possible methods to perform this general concept: [Numbering rows within groups in a data frame](https://stackoverflow.com/questions/12925063/numbering-rows-within-groups-in-a-data-frame) The only difference for your case is you'll need to subtract 1 from your "counter" column. — Matt Summersgill, Jun 28 '23 at 17:22

Gregor Thomas · Answer 1 · 2023-06-28T17:26:41.323

It looks like your desired result is the row number for each surgeon minus one.

library(dplyr)

your_data |>
  mutate(event_before_index = row_number() - 1, .by = Surgeon)

This assumes your data is already sorted by date. If not, you can convert your date column to a proper Date class and sort it before running the above command.

library(lubridate)
your_data |>
  mutate(Operation_Date = dmy(Operation_Date)) |>
  arrange(Surgeon, Operation_Date) |>
  mutate(event_before_index = row_number() - 1, .by = Surgeon)

score -1 · Accepted Answer · answered Jun 28 '23 at 17:56

Please try the below code which can be give the Event before index (operation date) even if the patient is repeated operated by surgeon previously please check the new variable

however at some places it does not match i.e., highlighted with #

library(dplyr)

df_2 <- df %>% group_by(Surgeon,Patient) %>% 
  slice_tail(n=1) %>% group_by(Surgeon) %>% mutate(new=row_number()-1)

df_or <- df %>% 
  left_join(df_2 %>% select(Patient, Surgeon, new), by=c('Surgeon','Patient')) 


# output

# A tibble: 6 × 5
  Patient Surgeon Operation_Date Event_before   new
    <dbl> <chr>   <date>                <dbl> <dbl>
1       1 A       2023-06-28                0     0
2       2 A       2023-06-29                1     1
3       3 A       2023-06-30                2     2
4       4 B       2023-07-01                0     0
5       5 C       2023-07-02                1     0 #
6       6 C       2023-07-03                2     1 #

Thank you all for taking the time to reply. I will check the codes and let you know how I get on. — Daniel Fudulu, Jun 30 '23 at 12:29
Thank you @DanielFudulu for the confirmation and choosing it as an answer. — jkatam, Jun 30 '23 at 14:41

Count events in R per row depending on data range

2 Answers2