I have a dataset of tests ("Dates_try") for which I need to calculate statistics by weekly time period. There are 81 weeks in total, and their start and stop dates are stored in a dataframe called "Testing_Dates" (81 rows, two variables: a start date and a stop date). For each time period, Dates_try has a flag variable that marks an individual who tested positive in both the current time period and a previous one. Thus, I have 81 variables named sequentially, Flag_Posbefored1 through Flag_Posbefored81. For each time period, I only want to summarize individuals who were tested within that period and whose flag for that period is NA or 0. How do I iteratively filter on the Flag_Posbefored1:81 variables?

I didn't want to type out all of my flags, but here is an example of my Dates_try dataframe, where AlienID represents unique individuals, TestDate is the test date, Flag_Posbefored1 represents the flag for the first time period:

structure(list(AlienID = c(1289762, 1289792, 1289760, 1289774, 
1377879), TestDate = structure(c(19009, 19009, 19009, 19009, 
19109), class = "Date"), Flag_Posbefored1 = c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), Flag_Posbefored2 = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_), Flag_Posbefored3 = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA, 5L
), class = "data.frame")

Testing_Dates looks like the following:

structure(list(Start.Date = structure(c(18587, 18594, 18601, 
18608, 18615), class = "Date"), End.Date = structure(c(18593, 
18600, 18607, 18614, 18621), class = "Date")), row.names = c(NA, 
5L), class = "data.frame")

I tried:

for(i in 1:81){
  name <- paste("Flag_Posbefored", i, sep = "")

  PerPos_1 <- Dates_try %>%
    filter((TestDate >= Testing_Dates$Start.Date[i] & TestDate <= Testing_Dates$End.Date[i]) &
             (!!name == 0 | is.na(!!name)))
}

This returned an empty dataframe. When I replaced `name` with the actual column name (e.g. i=1 and Flag_Posbefored1), the result had rows.
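A likely cause of the empty result: `name` holds a character string, so `!!name` injects the string itself rather than the column. The filter then evaluates `"Flag_Posbefored1" == 0` and `is.na("Flag_Posbefored1")`, both of which are FALSE for every row. The loop also overwrites `PerPos_1` on each pass, so only the last period would survive. Below is a minimal sketch of one possible fix, using dplyr's `.data[[name]]` pronoun and storing each period's result in a list. The toy values and the `results` list are assumptions made for illustration, not the original data.

```r
library(dplyr)

# Toy data with hypothetical values, shaped like the data frames in the question
Dates_try <- data.frame(
  AlienID = c(1, 2, 3, 4),
  TestDate = as.Date(c("2020-11-21", "2020-11-22", "2020-11-23", "2020-11-29")),
  Flag_Posbefored1 = c(NA, 0, 1, NA),
  Flag_Posbefored2 = c(NA, NA, NA, 1)
)
Testing_Dates <- data.frame(
  Start.Date = as.Date(c("2020-11-21", "2020-11-28")),
  End.Date   = as.Date(c("2020-11-27", "2020-12-04"))
)

# One list element per time period (hypothetical container name)
results <- vector("list", nrow(Testing_Dates))

for (i in seq_len(nrow(Testing_Dates))) {
  name <- paste0("Flag_Posbefored", i)

  results[[i]] <- Dates_try %>%
    filter(
      TestDate >= Testing_Dates$Start.Date[i],
      TestDate <= Testing_Dates$End.Date[i],
      # .data[[name]] looks up the column whose name is stored in `name`
      .data[[name]] == 0 | is.na(.data[[name]])
    )
}
```

With the toy data, `results[[1]]` keeps the two period-1 tests whose flag is NA or 0, and `results[[2]]` is empty because the only period-2 test is flagged. `!!sym(name)` (with `sym()` from rlang) should behave the same way inside `filter()`.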

Almoss277
  • Please share a few rows of sample data so we can see what's going on. `dput()` is the nicest way to share sample data as it is copy/pasteable and includes all data structure info, e.g., `dput(Dates_try[1:5, ])` for the first 5 rows. Please also share a few relevant rows of `Testing_Dates` – Gregor Thomas Jun 05 '23 at 19:19
  • I've edited my original post with the output of dput() – Almoss277 Jun 05 '23 at 20:22
  • Not sure if I understood the issue but it looks like reshaping `Dates_try` to long format (using packages {dplyr} and {tidyr}) might be an easier way forward. Check whether the expression: `Dates_try |> pivot_longer(starts_with('Flag_Posbefore'))` creates a suitable dataframe for your filter requirements. – I_O Jun 05 '23 at 21:23
  • This works! Thanks! – Almoss277 Jun 06 '23 at 15:31
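
The long-format approach suggested in the comments might look roughly like the sketch below. It assumes that the number in each flag column's name matches the row number of the corresponding period in `Testing_Dates`. The toy values and the names `periods`, `period`, `flag`, `long`, and `n_tests` are hypothetical, chosen only for illustration.

```r
library(dplyr)
library(tidyr)

# Toy data with hypothetical values, shaped like the data frames in the question
Dates_try <- data.frame(
  AlienID = c(1, 2, 3, 4),
  TestDate = as.Date(c("2020-11-21", "2020-11-22", "2020-11-23", "2020-11-29")),
  Flag_Posbefored1 = c(NA, 0, 1, NA),
  Flag_Posbefored2 = c(NA, NA, NA, 1)
)
Testing_Dates <- data.frame(
  Start.Date = as.Date(c("2020-11-21", "2020-11-28")),
  End.Date   = as.Date(c("2020-11-27", "2020-12-04"))
)

# Number the periods so they can be joined to the flag columns
periods <- Testing_Dates %>% mutate(period = row_number())

long <- Dates_try %>%
  # One row per (test, period); the period index is parsed from the column name
  pivot_longer(
    starts_with("Flag_Posbefored"),
    names_to = "period",
    names_prefix = "Flag_Posbefored",
    names_transform = list(period = as.integer),
    values_to = "flag"
  ) %>%
  inner_join(periods, by = "period") %>%
  # Keep tests inside their period whose flag is NA or 0
  filter(TestDate >= Start.Date, TestDate <= End.Date, is.na(flag) | flag == 0)

# Example summary: number of qualifying tests per period
long %>% count(period, name = "n_tests")
```

This replaces the 81-iteration loop with a single reshape, join, and filter, so any per-period statistic can then be computed with `group_by(period)` or `count()`.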
