
I'm dealing with some survey data and using dplyr, gtsummary, and the survey package to do my analysis.

One area of the survey goes like this:

  1. A question about whether the respondent would consider using the service
  2. Those who would not are shown a list of potential reasons and asked to select the ones that apply.

I've used filter() and select() to restrict my analysis to this subset of questions (columns) among the people (rows) who said they are not likely to use the service. This reduces my N from 150 to 41 non-considerers. So far so good; the code works.

The problem I'm having is that when I use tbl_svysummary(), the percentage displayed next to each question is 100% (the count divided by the non-missing responses only) instead of the count out of 41. I've tried adding %>% add_n(statistic="{p_nonmiss}") to the tbl_svysummary() call, but it gives me the same result.

If it's helpful to know, I'm using tbl_svysummary instead of tbl_summary because I need to use the survey package to calibrate the data.

Here's what the output looks like: https://i.stack.imgur.com/6FLu7.png

And here is my code:


library(dplyr)
library(survey)
library(gtsummary)

# Keep only the non-considerers and the barrier checkbox columns (1 and 6 to 22)
barriers <- df %>%
  filter(QCONSIDERATION %in% c("Not likely", "Not at all likely")) %>%
  select(num_range("QCONSIDERATIONSBAR_", c(1, 6:22)))

# Survey design object with no weights supplied
barrierssvy <- svydesign(id = ~1, data = barriers)

tbl_svysummary(barrierssvy) %>% add_n(statistic = "{p_nonmiss}")

  • I'm not at a computer to give you a full response. The tbl_svysummary() function is going to produce denominators consistent with the data passed. If the observations have been removed, the denominator cannot be adjusted to account for data no longer in the data object. What you can do is build two separate tbls, one on the full data and one on the subset. Then stack the tables using tbl_stack(). – Daniel D. Sjoberg Dec 24 '21 at 20:48
  • Please post sample data or make a dummy_df. head(df) %>% dput() – OTA Dec 24 '21 at 20:50
  • Thanks for the comments. I'm new to R so will work on making some sample data. – researchgnome9000 Dec 24 '21 at 23:24
  • @DanielD.Sjoberg - Thanks. The issue is not that I've dropped observations, but that the missing counts are not counting towards the total. So in the example image, 41 people could have checked the box, 34 did and 7 are 'missing'. I want the % displayed to be 34/41, not 34/34. The observations I dropped are from people who never saw the question (i.e., not part of the 41). – researchgnome9000 Dec 24 '21 at 23:28
  • @researchgnome9000: You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to quickly create a reproducible example so others can help. Please do not use `str()`, `head()` or screenshot. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Dec 25 '21 at 08:14
  • Oh I see. You can make the NA values explicit with this function. Then missing values will appear like any other level. https://forcats.tidyverse.org/reference/fct_explicit_na.html – Daniel D. Sjoberg Dec 25 '21 at 13:37
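
The idea in the last comment, making the missing responses an explicit level so they stay in the denominator, could look roughly like the sketch below. It is not the asker's data: the data frame `barriers_demo`, its columns `BAR_A` and `BAR_B`, and the "Selected" / NA coding are all hypothetical, and the recoding uses `dplyr::if_else()` rather than `fct_explicit_na()`, which should have the same effect for this purpose. The equal weights (`weights = ~1`) are also an assumption; a calibrated design would go in their place.

```r
library(dplyr)
library(survey)
library(gtsummary)

# Hypothetical stand-in for the 41 non-considerers: a ticked box is stored
# as "Selected" and an unticked box as NA (assumed coding, not from the post).
barriers_demo <- data.frame(
  BAR_A = c(rep("Selected", 34), rep(NA, 7)),
  BAR_B = c(rep("Selected", 10), rep(NA, 31)),
  stringsAsFactors = FALSE
)

# Recode NA to an explicit "Not selected" level so that those respondents
# are counted in the denominator instead of being treated as missing.
barriers_explicit <- barriers_demo %>%
  mutate(across(
    everything(),
    ~ factor(if_else(is.na(.x), "Not selected", "Selected"),
             levels = c("Selected", "Not selected"))
  ))

# Equal weights are assumed here purely for illustration.
demo_svy <- svydesign(ids = ~1, weights = ~1, data = barriers_explicit)

# Show count, denominator, and percentage for each level.
barriers_tbl <- tbl_svysummary(
  demo_svy,
  statistic = all_categorical() ~ "{n}/{N} ({p}%)"
)
barriers_tbl
```

With this toy data, the "Selected" row for `BAR_A` should be reported out of 41 rather than out of 34, because the 7 unticked responses are now a level of their own instead of NA.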
