
I have a data frame with drug abuse information. Each row is a treatment episode containing the STFIPS code for the state in which treatment occurred and a column with a 1 or a 0, indicating completion of treatment or failure to complete, respectively.

STFIPS OUTCOME
4 0
5 0
5 1
5 1

I am trying to find the percentage of 1's for each state, i.e. the percentage of treatment completion per state (STFIPS) in the data frame. Ideally, this percentage would be in a separate column.

Anybody got any ideas? Many thanks in advance!

  • From your description, this is a duplicate of the linked question/answer. To help cement it, I provided some hints in an answer below (you can choose to "accept" if you want), but the linked answer has significantly more information and a few more options. Future questions would really benefit from being *reproducible*, including sample data (such as the output from `dput(head(x,10))`), code attempted, errors/warning text, and expected output. Please see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. Thanks! – r2evans Aug 16 '21 at 21:21

1 Answer

Using mtcars as an example, I'll focus on cyl (as the grouping variable) and on vs and am (as the 0/1 variables).

base R

do.call(rbind, by(mtcars[,c("vs","am")], mtcars$cyl, FUN = function(z) sapply(z, sum) / nrow(z)))
#          vs        am
# 4 0.9090909 0.7272727
# 6 0.5714286 0.4285714
# 8 0.0000000 0.1428571

(You'll want to bring the cyl from the rownames(.) into a column in order to better match those below.)
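A minimal sketch of that step, assuming you store the result first: the group labels end up in the row names of the matrix, so they can be copied into an ordinary column. The names `res` and `out` are only illustrative.

```r
# Proportion of 1s per cyl group, same computation as the by() call
res <- do.call(rbind, by(mtcars[, c("vs", "am")], mtcars$cyl,
                         FUN = function(z) sapply(z, sum) / nrow(z)))

# Move the group labels from the row names into a regular column
out <- data.frame(cyl = as.numeric(rownames(res)), res, row.names = NULL)
out
```

This gives a plain data frame with columns cyl, vs and am, which lines up with the dplyr and data.table results.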

dplyr

library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  summarize(across(c(vs,am), ~ sum(.) / n()))
# # A tibble: 3 x 3
#     cyl    vs    am
#   <dbl> <dbl> <dbl>
# 1     4 0.909 0.727
# 2     6 0.571 0.429
# 3     8 0     0.143

data.table

library(data.table)
as.data.table(mtcars)[, lapply(.SD, function(z) sum(z) / .N), by = cyl, .SDcols = c("vs", "am")]
#      cyl        vs        am
#    <num>     <num>     <num>
# 1:     6 0.5714286 0.4285714
# 2:     4 0.9090909 0.7272727
# 3:     8 0.0000000 0.1428571
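Applied to the data in the question, the same idea might look like the sketch below. It assumes the columns are named STFIPS and OUTCOME as shown, and uses only the four sample rows from the question; `by_state` and `PCT_COMPLETE` are hypothetical names chosen for illustration. Since OUTCOME is 0/1, its mean within a state is the proportion of completed episodes.

```r
# The four sample rows shown in the question
df <- data.frame(STFIPS = c(4, 5, 5, 5), OUTCOME = c(0, 0, 1, 1))

# One row per state: 100 * mean of a 0/1 column is the percentage of 1s
by_state <- aggregate(OUTCOME ~ STFIPS, data = df,
                      FUN = function(z) 100 * mean(z))
by_state

# Or keep every episode and add the state-level percentage as a new column
# (ave() applies mean() within each STFIPS group by default)
df$PCT_COMPLETE <- 100 * ave(df$OUTCOME, df$STFIPS)
df
```

The `ave()` version is the one that puts the percentage in a separate column alongside the original rows; the `aggregate()` version collapses to one row per state.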
r2evans