0

I have data in which the rows are ordered in chrono order of all shots taken in a game and the column "fg_result" is given a 1 for a made shot and a 0 for missed shot. I want to mutate a column in dplyr that returns the number of shots made in a row. I feel like a lag or lead function would be needed but am not sure. Here's a short piece of the df I'm working with. You'll notice only one miss (0) and the rest 1s. Ideally, the correct result for what I want is a new col (made_in_row) to equal 1, 2, 0, 1, 2,3,4

structure(list(gameid = c("ChicagoBoston19961101", "ChicagoBoston19961101", 
"ChicagoBoston19961101", "ChicagoBoston19961101", "ChicagoBoston19961101", 
"ChicagoBoston19961101", "ChicagoBoston19961101"), player_fg = c("D. Rodman", 
"D. Wesley", "M. Jordan", "E. Williams", "S. Pippen", "S. Pippen", 
"S. Pippen"), fg_result = c(1, 1, 0, 1, 1, 1, 1)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -7L), groups = structure(list(
gameid = "ChicagoBoston19961101", .rows = list(1:7)), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"), .drop = FALSE))
Jeff Henderson
  • 643
  • 6
  • 10

1 Answers1

3

In data.table you can use rleid to create groups and assign row number in each group. We can set made_in_row to 0 where fg_result = 0.

library(data.table)
setDT(df)[, made_in_row := seq_len(.N), .(gameid, rleid(fg_result))]
df[fg_result == 0, made_in_row := 0]  

df
#                  gameid   player_fg fg_result made_in_row
#1: ChicagoBoston19961101   D. Rodman         1           1
#2: ChicagoBoston19961101   D. Wesley         1           2
#3: ChicagoBoston19961101   M. Jordan         0           0
#4: ChicagoBoston19961101 E. Williams         1           1
#5: ChicagoBoston19961101   S. Pippen         1           2
#6: ChicagoBoston19961101   S. Pippen         1           3
#7: ChicagoBoston19961101   S. Pippen         1           4

We can use the same logic in dplyr as well :

library(dplyr)
df %>%
  group_by(gameid, grp = rleid(fg_result)) %>%
  mutate(made_in_row  =row_number(), 
         made_in_row = replace(made_in_row, fg_result == 0, 0))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213