I have a data frame in which patients have multiple observations of medication use over time. Some patients used medication consistently and others have gaps. I am trying to count the patients who have never used medication.

I can't show the actual data but here is an example data frame of what I am working with.

 patid meds
     1    0
     1    1
     1    1
     2    0
     2    0
     3    1
     3    1
     3    1
     4    0
     5    1
     5    0

In this example, two patients (2 and 4) never used medication. That count is what I'm looking for.

I'm fairly new to R and have no idea how to do this, so any help would be appreciated.

bpg

3 Answers

Here is another alternative using the dplyr package.

library(dplyr)
df <- data.frame(patid = c(1,1,1,2,2,3,3,3,4,5,5),
                 meds = c(0,1,1,0,0,1,1,1,0,1,0))
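df %>% 
    # keep one row per (patid, meds) combination
    distinct(patid, meds) %>% 
    # put meds == 1 rows first, so a patient's meds == 0 row counts as a duplicate if they ever used medication
    arrange(desc(meds)) %>% 
    # keep meds == 0 rows for patients that have no meds == 1 row
    filter(meds == 0 & !duplicated(patid))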
#   patid meds
#1     2    0
#2     4    0 
Sri Sreshtan

Try this:

library(dplyr)

#Data
df <- structure(list(patid = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 
5L, 5L), meds = c(0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L)), class = "data.frame", row.names = c(NA, 
-11L))

#Code
# Sum meds per patient; a total of 0 means the patient never used medication
df %>%
    group_by(patid) %>%
    summarise(sum = sum(meds, na.rm = TRUE)) %>%
    filter(sum == 0)

# A tibble: 2 x 2
  patid   sum
  <int> <int>
1     2     0
2     4     0
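
Since the question asks for a count rather than the rows themselves, the grouped result can be reduced to a single number. The following is only a sketch under the assumption that meds is coded 0/1 as in the example data; the names any_use and never_used are made up for illustration and do not come from the answers here.

```r
library(dplyr)

# Example data from the question
df <- data.frame(patid = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5),
                 meds  = c(0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0))

# One row per patient, flagging whether any observation shows medication use
# (any_use and never_used are hypothetical names)
never_used <- df %>%
    group_by(patid) %>%
    summarise(any_use = any(meds == 1, na.rm = TRUE)) %>%
    filter(!any_use)

# Number of patients with no recorded medication use
n_never <- nrow(never_used)
n_never
```

With na.rm = TRUE, a patient whose observations are all missing would be counted as never having used medication; whether that is appropriate depends on the real data.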
Duck

A base R solution, using the same df as above, could be

subset(aggregate(meds ~ patid, df, sum), meds == 0)

which returns

  patid meds
2     2    0
4     4    0
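
If only the number of such patients is needed, a base R variant might look like the sketch below. It assumes meds is coded 0/1 with no missing values, and the names never_used and n_never are illustrative only.

```r
# Example data from the question
df <- data.frame(patid = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5),
                 meds  = c(0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0))

# For each patient, TRUE if every observation is 0
# (never_used and n_never are hypothetical names)
never_used <- tapply(df$meds, df$patid, function(x) all(x == 0))

# Count of patients who never used medication
n_never <- sum(never_used)
n_never

# IDs of those patients, taken from the names of the tapply result
never_ids <- names(never_used)[never_used]
never_ids
```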
Martin Gal