I have data like the following:

PersonId (unique)  Disease  Survival
1                  A        1
2                  B        0
3                  A        0
4                  C        1
5                  B        0
6                  D        1
7                  C        0
8                  A        1
9                  D        0
10                 D        1

I want to compute a ratio from this table. The ratio is defined as:

Survival rate by disease: Number of survivors (1) by disease / Total number of people by disease

From this calculation, I want to build a table like this:

Disease  Total number of people  Number of survivors  Oran (ratio)
A        3                       2                    0.66
B        2                       0                    0
C        2                       1                    0.5
D        3                       2                    0.66

I don't know where to start. What kind of code should I write to get a table like this?

Coder
    Does this answer your question? [How to sum a variable by group](https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group) – Limey Nov 08 '21 at 06:37

2 Answers

Using base R:

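# One row per statistic (count, survivors, survival rate), then transpose
# so that each disease becomes a row. Uses the df defined under "The dataset".
tab <- t(rbind(table(df$Disease),
               tapply(df$Survival, df$Disease, sum),
               tapply(df$Survival, df$Disease, mean)))
tab <- as.data.frame(tab)
names(tab) <- c('Frequency', 'Survived', 'Ratio')
tab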
#   Frequency Survived     Ratio
# A         3        2 0.6666667
# B         2        0 0.0000000
# C         2        1 0.5000000
# D         3        2 0.6666667

The dataset:

df<-data.frame(Disease=c('A','B','A','C','B','D','C','A','D','D'),
               Survival=c(1,0,0,1,0,1,0,1,0,1))
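
As a possible alternative in base R, the same summary can be sketched with `aggregate()`, which keeps `Disease` as a regular column rather than as row names. This is only a sketch under the assumption that `Survival` is coded strictly as 0/1 with no missing values; the name `res` and the column names `Total`, `Survivors`, and `Ratio` are illustrative choices, not part of the question.

```r
# Same example data as above
df <- data.frame(Disease  = c('A','B','A','C','B','D','C','A','D','D'),
                 Survival = c(1,0,0,1,0,1,0,1,0,1))

# aggregate() applies the function once per Disease group. Because the
# function returns a named vector, the result holds a matrix column,
# which do.call(data.frame, ...) flattens into ordinary columns.
res <- aggregate(Survival ~ Disease, data = df,
                 FUN = function(x) c(Total = length(x),
                                     Survivors = sum(x),
                                     Ratio = mean(x)))
res <- do.call(data.frame, res)
names(res) <- c('Disease', 'Total', 'Survivors', 'Ratio')
res
```

The `do.call(data.frame, res)` step is there because `aggregate()` otherwise stores the three statistics inside a single matrix column, which is awkward to index by name.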
DeBARtha

I'd use dplyr:

library(dplyr)

df %>% group_by(Disease) %>%
   summarize(Total=n(), Survivors=sum(Survival), Oran=mean(Survival))

Output:

  Disease Total Survivors  Oran
  <chr>   <int>     <int> <dbl>
1 A           3         2 0.667
2 B           2         0 0    
3 C           2         1 0.5  
4 D           3         2 0.667
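
Both answers rely on `mean(Survival)` to get the rate. That works because, for a 0/1 indicator, the mean equals the number of ones divided by the number of observations. The base R sketch below checks this against the explicit "survivors / total" formula from the question. It assumes `Survival` contains only 0 and 1 with no `NA` values; the variable names are illustrative.

```r
# Same example data as in the first answer
df <- data.frame(Disease  = c('A','B','A','C','B','D','C','A','D','D'),
                 Survival = c(1,0,0,1,0,1,0,1,0,1))

# Explicit formula from the question: survivors / total, per disease
survivors <- tapply(df$Survival, df$Disease, sum)
total     <- tapply(df$Survival, df$Disease, length)
rate_by_division <- survivors / total

# Shortcut used in the answers: the mean of a 0/1 column
rate_by_mean <- tapply(df$Survival, df$Disease, mean)

# The two approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(rate_by_division, rate_by_mean)))

# Two-decimal version for display
rate_rounded <- round(rate_by_mean, 2)
rate_rounded
```

Note that `round()` gives 0.67 for diseases A and D, whereas the 0.66 in the question's table looks truncated rather than rounded.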
U13-Forward