
I want to subset a data frame and, for each id, take all observations up to the first observation that doesn't meet my condition. Something like this:

goodDaysAfterTreatMent <- subset(Patientdays, treatmentDate < date & goodThings > badThings)

Except that this returns all observations that meet the condition. I want something that stops at the first observation that doesn't meet the condition, moves on to the next id, returns all observations for that id up to its first failure, and so on.

The only way I can see is to use a lot of loops, and that's usually not a good thing.

Hope you guys have an idea.

2 Answers

1

Assume that your condition is to return rows where v < 5:

# example dataset
df = data.frame(id = c(1,1,1,1,2,2,2,2,3,3,3),
                v = c(2,4,3,5,4,5,6,7,5,4,1))

df

#    id v
# 1   1 2
# 2   1 4
# 3   1 3
# 4   1 5
# 5   2 4
# 6   2 5
# 7   2 6
# 8   2 7
# 9   3 5
# 10  3 4
# 11  3 1

library(tidyverse)

df %>%
  group_by(id) %>%          # for each id
  mutate(flag = cumsum(ifelse(v < 5, 1, NA))) %>%  # running count while v < 5; NA from the first failing row onward
  filter(!is.na(flag)) %>%  # keep only rows with no NA flags
  ungroup() %>%             # forget the grouping
  select(-flag)             # remove flag column


# # A tibble: 4 x 2
#      id     v
#   <dbl> <dbl>
# 1     1     2
# 2     1     4
# 3     1     3
# 4     2     4     
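
To see why the flag works, here is a minimal sketch of the same cumsum trick on a single group in base R. The values of v below are hypothetical (not taken from the example data) and are chosen so that the condition is TRUE again after the first FALSE:

```r
# Hypothetical values for one id: the condition v < 5 is TRUE, TRUE, FALSE, TRUE, TRUE
v <- c(2, 4, 5, 3, 1)

# 1 where the condition holds, NA where it fails
step <- ifelse(v < 5, 1, NA)

# cumsum() carries the first NA forward to every later position
flag <- cumsum(step)

# Only the values before the first failure survive
kept <- v[!is.na(flag)]
kept
```

The later values 3 and 1 meet the condition but are still dropped, because the NA from the first failing row propagates through the rest of the cumulative sum.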
AntoniosK
  • What I want is a bit different. This just gives all values that meet a condition. I want all observations until the first FALSE, then move to the next ID, and so on. So if the condition in the data set is TRUE, TRUE, FALSE, TRUE, TRUE, the code should read this as TRUE, TRUE, because there are only 2 TRUE values before the first FALSE. I want this done for all of the IDs in my data frame, so one ID could have 10 TRUE while the next could have 2 TRUE, and so on. I'm trying to see how many continuous good days a patient has had after a certain point. – William Parker Aug 06 '18 at 09:07
  • Yes, I know. That's why for id = 3 you don't get any rows. Because the first is 5 no matter what's next. Try to change the v values and see how the process changes. – AntoniosK Aug 06 '18 at 09:10
  • Maybe my example doesn't make it very clear how it works :) I'll try to update the values later to make it clearer, but I think you can experiment yourself. But you can see that if the process returned all rows that meet a condition then it should have returned rows for id = 3. – AntoniosK Aug 06 '18 at 09:15
  • Ah, okay, I see. Sorry, I didn't notice id = 3. – William Parker Aug 06 '18 at 09:20
0

Easy way:

Find the first FALSE with min(which(condition == FALSE)):

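Patientdays <- data.frame(treatmentDate = c(1:5, 4, 6:10), date = c(2:5, 3, 6:10, 10), goodThings = 1:11, badThings = 0:10)

# Evaluate the condition inside the data frame with with() instead of attach()
condition <- with(Patientdays, treatmentDate < date & goodThings > badThings)

# Keep the rows before the first FALSE (assumes there is a FALSE and it is not in row 1)
Patientdays[1:(min(which(condition == FALSE)) - 1), ]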

Edit: Adding result.

  treatmentDate date goodThings badThings
1             1    2          1         0
2             2    3          2         1
3             3    4          3         2
4             4    5          4         3 
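
The same idea could be applied per id, as a rough sketch in base R. The id column and the data values below are assumptions made for illustration (the data above has no id column), and ave() with cumsum is swapped in for min(which(...)) so that no explicit loop is needed:

```r
# Hypothetical data: the id column and all values are made up for illustration
Patientdays <- data.frame(
  id            = c(1, 1, 1, 2, 2, 2),
  treatmentDate = c(1, 2, 5, 1, 4, 3),
  date          = c(2, 3, 3, 2, 2, 6),
  goodThings    = c(1, 2, 3, 4, 5, 6),
  badThings     = c(0, 1, 2, 3, 4, 5)
)

condition <- with(Patientdays, treatmentDate < date & goodThings > badThings)

# Running count of failed rows within each id; 0 means no FALSE seen yet for that id
failuresSoFar <- ave(as.integer(!condition), Patientdays$id, FUN = cumsum)

# Keep, for every id, only the rows before its first FALSE
result <- Patientdays[failuresSoFar == 0, ]
result
```

Filtering on a running failure count should also cope with an id whose first row fails (no rows are kept) or that never fails (all rows are kept), two cases where indexing with 1:(min(which(...)) - 1) misbehaves.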
Iman
  • Cool, thank you. However, this only finds the first FALSE overall. I would like all the values for each ID up until its first FALSE. I want to go through my entire data frame, which has a lot of observations and several observations for each ID (person). I want all the observations for each ID until the first FALSE (the bit that you did); when the first FALSE is met, the (loop?) goes on to the next ID and finds all its observations until the first FALSE, then the next ID, and so on. – William Parker Aug 06 '18 at 08:35
  • @ParkerWilliam please read https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and elaborate your question by providing data and example. – Iman Aug 06 '18 at 09:09