I have a number of trials in which one variable increases to a maximum of interest and then decreases back to its starting point. How would I go about retaining only the observations where the values are increasing up to the max? Thanks.

For example

Trial A B C
    1 2 4 1
    1 4 3 2
    1 3 7 3
    1 3 3 2
    1 4 1 1
    2 4 1 1
    2 6 2 2
    2 3 1 3
    2 1 1 2
    2 7 3 1
    ...

So we would check the max of C within each trial and retain the rows up to it, as follows:

Trial A B C
    1 2 4 1
    1 4 3 2
    1 3 7 3
    2 4 1 1
    2 6 2 2
    2 3 1 3
    ...

Ultimately I'll also have a low cut-off value, and I may vary what I mean by max, but essentially the above is the aim.

ksing
  • You could start with a reproducible example and desired output illustration – David Arenburg Jul 22 '15 at 18:54
  • Yeah, I should have...but thought this might be one of those very simple examples. I'll see about editing. – ksing Jul 22 '15 at 18:58
  • Try `df[c(TRUE,diff(df$C)>=0),]`. – nicola Jul 22 '15 at 19:05
  • @DavidArenburg, yes, I think you are right. – nicola Jul 22 '15 at 19:06
  • @nicola I guess this won't work if the first value of the second group is smaller than the last one in the first. Not sure if this is possible in the real data though – David Arenburg Jul 22 '15 at 19:11
  • @DavidArenburg Yes, that was my thought too. I actually don't have time now to fix this problem; if you or others want to use my comment as a base for an answer, feel free, of course. – nicola Jul 22 '15 at 19:14
  • Maybe `df[ave(df$C,df$Trial,FUN=function(x) c(TRUE,diff(x)>0))==1,]`? Ok, now I really need to leave and can't test with other sets of data. – nicola Jul 22 '15 at 19:17
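
The per-trial idea in the comments above can be tried on the question's sample data. This is only a sketch in base R: it assumes the data frame is called `df` (the name used throughout the thread), and it swaps the `diff()` test for a comparison against the position of the first maximum, so it keeps every row up to the peak even if C dips before reaching it. The names `keep` and `result` are made up for the illustration.

```r
# Sample data copied from the question; the name `df` is an assumption
df <- data.frame(
  Trial = rep(1:2, each = 5),
  A = c(2, 4, 3, 3, 4, 4, 6, 3, 1, 7),
  B = c(4, 3, 7, 3, 1, 1, 2, 1, 1, 3),
  C = c(1, 2, 3, 2, 1, 1, 2, 3, 2, 1)
)

# Within each trial, flag rows from the start up to and including the first
# maximum of C. ave() stores the logical flags as 0/1, hence the `== 1`.
keep <- ave(df$C, df$Trial, FUN = function(x) seq_along(x) <= which.max(x)) == 1

# `result` is a hypothetical name for the filtered data frame
result <- df[keep, ]
result
```

Because the test is done inside each trial, a drop in C across the boundary between two trials does not remove the first row of the next trial.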

2 Answers

Probably not the most efficient solution, but here is an attempt using data.table

library(data.table)
setDT(df)[, .SD[1:which.max(C)], by = Trial]
#    Trial A B C
# 1:     1 2 4 1
# 2:     1 4 3 2
# 3:     1 3 7 3
# 4:     2 4 1 1
# 5:     2 6 2 2
# 6:     2 3 1 3

Or for some efficiency gain

indx <- setDT(df)[, .I[1:which.max(C)], by = Trial]
df[indx$V1]
David Arenburg
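library(dplyr)
# which.max(C) is the row position of the peak; max(C) is the peak's value and only matches the position by coincidence in the sample data
df %>% group_by(Trial) %>% slice(seq_len(which.max(C)))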
Shenglin Chen
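
Slicing by `max(C)` and slicing by `which.max(C)` give the same rows on the sample data only because the values of C there happen to equal their row positions. A small base R sketch, using hypothetical C values that are not from the question, shows the difference:

```r
# Hypothetical C values for one trial, chosen so the peak value differs
# from the peak position
C <- c(10, 20, 30, 20, 10)

peak_value <- max(C)           # the largest value of C
peak_position <- which.max(C)  # index of the first occurrence of that value

# Indexing up to the position keeps only the rising part of the trial
rising <- C[seq_len(peak_position)]
rising
```

Here `1:peak_value` would ask for 30 rows from a trial that only has 5, which is why the position rather than the value is the safer index.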