
I have a tibble where each patient can be observed several times. The columns are: id_patient (num), id_eval (num), treat_1 (logical), treat_2 (logical), treat_1_type (char), treat_2_type (char).

What I want: a summary table (with tbl_summary) that counts each patient only once, so I know how many patients were concerned by each possibility at least once. Something like this:

var All patients (n=N)
treat_1 AA (aa %)
treat_2 BB (bb %)
treat_1_type
- Type_1 CC (cc %)
- Type_2 DD (dd %)
treat_2_type
- Type_1 EE (ee %)
- Type_2 FF (ff %)
- Type_3 GG (gg %)

What I have for now is :

evals %>%
    group_by(id_patient) %>%
    select(id_patient, treat_1, treat_2) %>%
    summarise(across(everything(), .fns = unique)) %>%
    summary()

But that gives me every existing TRUE/FALSE combination per patient, so it does not really represent unique values. And this is only the logical part, the easy one; it will not work with factors...

How can I achieve that?

Galactose
  • Please `dput()` your data! – TarJae Sep 05 '21 at 11:48
  • It would be easier to help if you create a small reproducible example along with expected output. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Sep 05 '21 at 13:35

1 Answer


I wish you had given us a bit of data, but let's generate some ourselves.

library(tidyverse)

n=10
evals = tibble(
  id_patient = sample(1:50, n, replace = T),
  id_eval = sample(120:277, n),
  treat_1 = sample(c(T, F), n, replace = T),
  treat_2 = sample(c(T, F), n, replace = T),
  treat_1_type = sample(c("Type_1", "Type_2"), n, replace = T),
  treat_2_type = sample(c("Type_1", "Type_2", "Type_3"), n, replace = T)
)

evals

output (the data are sampled at random without a seed, so your values will differ)

# A tibble: 10 x 6
   id_patient id_eval treat_1 treat_2 treat_1_type treat_2_type
        <int>   <int> <lgl>   <lgl>   <fct>        <fct>       
 1         42     237 TRUE    FALSE   Type_2       Type_3      
 2         24     240 FALSE   FALSE   Type_1       Type_1      
 3         10     236 TRUE    FALSE   Type_1       Type_3      
 4         27     153 TRUE    FALSE   Type_1       Type_2      
 5         29     126 TRUE    FALSE   Type_2       Type_1      
 6         18     194 FALSE   TRUE    Type_1       Type_2      
 7         18     215 TRUE    FALSE   Type_2       Type_2      
 8         48     205 TRUE    FALSE   Type_1       Type_3      
 9         12     131 FALSE   FALSE   Type_1       Type_2      
10         13     225 FALSE   FALSE   Type_2       Type_3         

Does that look like your data? I hope so. Now let's build the summary you want.

seval = evals %>%
  group_by(id_patient) %>%
  # Step 1: one row per patient, TRUE if the condition held in at least one evaluation
  summarise(
    treat_1 = sum(treat_1)>0,
    treat_2 = sum(treat_2)>0,
    treat_1_Type_1 = sum(treat_1_type=="Type_1")>0,
    treat_1_Type_2 = sum(treat_1_type=="Type_2")>0,
    treat_2_Type_1 = sum(treat_2_type=="Type_1")>0,
    treat_2_Type_2 = sum(treat_2_type=="Type_2")>0,
    treat_2_Type_3 = sum(treat_2_type=="Type_3")>0
  ) %>%
  # Step 2: count the patients for whom each flag is TRUE
  summarise(
    treat_1 = sum(treat_1),
    treat_2 = sum(treat_2),
    treat_1_Type_1 = sum(treat_1_Type_1),
    treat_1_Type_2 = sum(treat_1_Type_2),
    treat_2_Type_1 = sum(treat_2_Type_1),
    treat_2_Type_2 = sum(treat_2_Type_2),
    treat_2_Type_3 = sum(treat_2_Type_3)
  )


output

# A tibble: 1 x 7
  treat_1 treat_2 treat_1_Type_1 treat_1_Type_2 treat_2_Type_1 treat_2_Type_2 treat_2_Type_3
    <int>   <int>          <int>          <int>          <int>          <int>          <int>
1       6       1              6              4              2              4              4
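If typing out every level by hand becomes tedious, the same "at least once per patient" idea can probably be written more generically. The following is only a sketch on a small hypothetical tibble (`toy`, with made-up values and the same column layout as `evals`); it assumes the type columns are character and uses `any()` for the logical flags and `distinct()` plus `count()` for the type columns.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy data (not the asker's data), same layout as `evals`
toy <- tibble(
  id_patient   = c(1, 1, 2, 3, 3),
  id_eval      = 101:105,
  treat_1      = c(TRUE, FALSE, FALSE, TRUE, TRUE),
  treat_2      = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  treat_1_type = c("Type_1", "Type_2", "Type_1", "Type_2", "Type_2"),
  treat_2_type = c("Type_3", "Type_1", "Type_2", "Type_2", "Type_2")
)

# Logical columns: one row per patient, TRUE if the flag was ever TRUE
per_patient_flags <- toy %>%
  group_by(id_patient) %>%
  summarise(across(c(treat_1, treat_2), any))

# Type columns: keep each (patient, variable, level) once, then count patients
type_counts <- toy %>%
  pivot_longer(ends_with("_type"), names_to = "var", values_to = "level") %>%
  distinct(id_patient, var, level) %>%
  count(var, level, name = "n_patients")

per_patient_flags
type_counts
```

Because `per_patient_flags` has exactly one row per patient, it should also be a reasonable input for a patient-level summary tool such as `tbl_summary()`, although that step is not tested here.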

Now you can easily calculate the proportions, dividing each count by the number of distinct patients:

seval %>% 
  pivot_longer(everything(), names_to = "var", values_to = "val") %>% 
  group_by(var) %>% 
  mutate(prop = val/length(unique(evals$id_patient)))

output

# A tibble: 7 x 3
# Groups:   var [7]
  var              val  prop
  <chr>          <int> <dbl>
1 treat_1            6 0.667
2 treat_2            1 0.111
3 treat_1_Type_1     6 0.667
4 treat_1_Type_2     4 0.444
5 treat_2_Type_1     2 0.222
6 treat_2_Type_2     4 0.444
7 treat_2_Type_3     4 0.444  
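To get closer to the "AA (aa %)" layout asked for in the question, the count and the proportion can be pasted into one label. This is a minimal sketch that re-enters the first two rows of the output above by hand; the name `counts`, the column `label`, and the choice of one decimal place are assumptions made for illustration.

```r
library(dplyr)
library(tibble)

# First two rows of the result above, typed in by hand for illustration
counts <- tibble(
  var = c("treat_1", "treat_2"),
  val = c(6L, 1L)
)
n_patients <- 9  # distinct patients in the sample above (6 / 0.667)

# Build an "n (p %)" label; "%%" prints a literal percent sign
report <- counts %>%
  mutate(label = sprintf("%d (%.1f %%)", val, 100 * val / n_patients))

report
```

With the real data, `n_patients` would be `length(unique(evals$id_patient))`, as in the proportion step above.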

I tested everything for both chr and factor variables and everything works fine.

Marek Fiołka