
I'm trying to group some messy time series data based on a value column. Essentially, I want a function that produces the column targetid: the dataset is grouped by id, and within each id a new targetid starts whenever a non-zero value begins again after one or more zeros.

a <- data.frame(
  id = rep(1:2, each = 8, times = 1),
  valuecolumn = c(5, 5, 10, 0, 0, 0, 5, 0, 5, 5, 0, 5, 10, 0, 0, 0),
  targetid = c(1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2)
)

This answer was probably the closest I could find, but it doesn't work for my case because the id resets at every non-zero value.

SlyGrogger

1 Answer


Thought I would post an answer to my rather specific question:

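library(dplyr)

a2 <- a %>%
  group_by(id) %>%
  # lag() returns the previous row's value within each id
  mutate(prev.valuecolumn = lag(valuecolumn),
         # the first row of each id has no previous value, so fall back to its own
         prev.valuecolumn2 = coalesce(prev.valuecolumn, valuecolumn),
         # flag rows where a non-zero value follows a zero
         diff = ifelse(valuecolumn > 0 & prev.valuecolumn2 == 0, 1, 0),
         # the running count of flags gives the new group id
         target2 = cumsum(diff) + 1)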
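
As a quick sanity check, the same logic can be written a little more compactly and compared against the targetid column from the question. This is only a sketch: the names prev_value, run_start, and a_check are made up for illustration, and lag()'s default argument is used here in place of the coalesce() step.

```r
library(dplyr)

# Example data from the question
a <- data.frame(
  id = rep(1:2, each = 8),
  valuecolumn = c(5, 5, 10, 0, 0, 0, 5, 0, 5, 5, 0, 5, 10, 0, 0, 0),
  targetid = c(1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2)
)

a_check <- a %>%
  group_by(id) %>%
  mutate(
    # previous value within the id; the first row falls back to its own value
    prev_value = lag(valuecolumn, default = first(valuecolumn)),
    # TRUE where a non-zero value follows a zero
    run_start = valuecolumn > 0 & prev_value == 0,
    # running count of run starts, offset so the first block is 1
    target2 = cumsum(run_start) + 1
  ) %>%
  ungroup()

# Compare the computed ids with the expected ones
all(a_check$target2 == a_check$targetid)
```

On the example data the comparison returns TRUE, so the computed ids line up with the targetid column.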

The row id doesn't 'reset', but that isn't required for my problem, since I can group by the user_id-target pair (id and target2 in the example above) to sum the values.
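
The follow-up summing step could look something like the sketch below. It assumes the example columns id and valuecolumn stand in for the real data, and the names sums and total are hypothetical.

```r
library(dplyr)

# Example data from the question (without the hand-written targetid)
a <- data.frame(
  id = rep(1:2, each = 8),
  valuecolumn = c(5, 5, 10, 0, 0, 0, 5, 0, 5, 5, 0, 5, 10, 0, 0, 0)
)

sums <- a %>%
  group_by(id) %>%
  # start a new block whenever a non-zero value follows a zero
  mutate(target2 = cumsum(
    valuecolumn > 0 & lag(valuecolumn, default = first(valuecolumn)) == 0
  ) + 1) %>%
  # regroup by the id/block pair and total the values in each block
  group_by(id, target2) %>%
  summarise(total = sum(valuecolumn), .groups = "drop")

sums
```

Because the grouping uses both id and target2, it doesn't matter that the same target2 value appears under more than one id.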

SlyGrogger