
Here is my sample dataset:

df <- structure(list(Class = c("A", "B", "C", "D"), 
`Attempted` = c(374, 820, 31, 108), 
`Missed` = c(291, 311, 5, 15), 
`Cancelled` = c(330, 206, 6, 5), 
`Unknown` = c(950, 341, 6, 13)), 
class = "data.frame", row.names = c(NA, -4L))

I want to create a mosaic plot that shows percentages instead of absolute counts. To be precise, I want to see what percentage of the class A population, out of the total class A population, missed their test, and similarly for each of the other classes.

I have not tried any code yet as I have absolutely no clue how to start. Can anyone please help me with this?

Z.Lin
amty

1 Answer


Using only one package (vcd), you can do the following. Note that I am labeling the cells with the proportions within each class (i.e. each row sums to 1):

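library(vcd)

# Build a contingency table of counts, with classes as rows
M = as.table(as.matrix(df[,-1]))
rownames(M) = df$Class  # label rows explicitly instead of relying on defaults
names(dimnames(M)) = c("Class","result")

# Row-wise proportions (each row sums to 1), used as cell labels
labs <- round(prop.table(M,margin=1), 2)

# pop = FALSE keeps the plot's viewports so the labels can be added afterwards
mosaic(M, pop = FALSE)
labeling_cells(text = labs, margin = 0)(M)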

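
To answer the specific question (what share of class A missed the test), the row proportions can also be checked directly in base R. This is only a minimal sketch; the names `counts`, `row_props` and `pct_missed_A` are mine and not part of the code above.

```r
# Sample data from the question
df <- data.frame(
  Class     = c("A", "B", "C", "D"),
  Attempted = c(374, 820, 31, 108),
  Missed    = c(291, 311, 5, 15),
  Cancelled = c(330, 206, 6, 5),
  Unknown   = c(950, 341, 6, 13)
)

# Counts as a numeric matrix with the class names as row names
counts <- as.matrix(df[, -1])
rownames(counts) <- df$Class

# Row-wise proportions: each row is divided by its own total
row_props <- prop.table(counts, margin = 1)

# Every row should sum to 1 (up to floating-point error)
stopifnot(isTRUE(all.equal(unname(rowSums(row_props)), rep(1, 4))))

# Percentage of class A that missed the test
pct_missed_A <- 100 * row_props["A", "Missed"]
print(round(pct_missed_A, 1))
```

Using `margin = 1` is what makes the percentages relative to each class rather than to the whole table.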

You can also just visualize it with a simple stacked bar plot:

library(RColorBrewer)
barplot(t(labs),col=brewer.pal(4,"Set2"))

legend("bottomright",legend = colnames(labs),inset=c(0,1.1), xpd=TRUE, 
       fill =brewer.pal(4,"Set2"),horiz=TRUE,cex=0.7)


If you use ggplot2 or other ggplot2-based packages, you first need to pivot your data into long format:

library(tidyr)
library(dplyr)
library(ggplot2)

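# Reshape to one row per Class/outcome pair, then compute within-class percentages
df_long = df %>% 
  pivot_longer(-Class) %>% 
  group_by(Class) %>% 
  mutate(total = sum(value),
         p = round(100*value/total,digits=1)) %>%
  ungroup()

# Stacked bars of the percentages, labelled with their values
ggplot(df_long,aes(x=Class,y=p,fill=name)) +
  geom_col() +
  geom_text(aes(label=p),position=position_stack(vjust=0.2))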

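
In case "pivot long" is unclear, here is a small sketch of what `pivot_longer()` does to the question's data on its own. The name `df_long_demo` is just for illustration; it is the same reshaping step as above without the percentage columns.

```r
library(tidyr)

# Sample data from the question
df <- data.frame(
  Class     = c("A", "B", "C", "D"),
  Attempted = c(374, 820, 31, 108),
  Missed    = c(291, 311, 5, 15),
  Cancelled = c(330, 206, 6, 5),
  Unknown   = c(950, 341, 6, 13)
)

# Every column except Class is stacked into two columns:
# "name" (the old column name) and "value" (the count)
df_long_demo <- pivot_longer(df, -Class)

# One row per Class/outcome pair: 4 classes x 4 outcomes
print(dim(df_long_demo))
print(head(df_long_demo, 4))
```

The default column names `name` and `value` are why the ggplot2 calls here use `fill = name`.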

If you want an actual mosaic plot in ggplot2, you can adapt this answer by Z.Lin. Note that I take the square root of the class totals so that the smaller classes are more visible:

ggplot(df_long,
       aes(x = Class, y = p, width = sqrt(total), fill = name)) +
geom_col(colour = "black") +
geom_text(aes(label = p), position = position_stack(vjust = 0.5)) + 
facet_grid(~Class, scales = "free_x", space = "free_x") +
theme_void()

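
To see why the square root helps, you can compare each class's share of the total width with and without it. This is a rough base R sketch; `totals`, `share_raw` and `share_sqrt` are illustrative names, and it only compares the values mapped to `width`, not the exact panel sizes ggplot2 ends up drawing.

```r
# Sample data from the question
df <- data.frame(
  Class     = c("A", "B", "C", "D"),
  Attempted = c(374, 820, 31, 108),
  Missed    = c(291, 311, 5, 15),
  Cancelled = c(330, 206, 6, 5),
  Unknown   = c(950, 341, 6, 13)
)

# Total number of people per class
totals <- setNames(rowSums(df[, -1]), df$Class)

# Relative width of each class if width were proportional to the raw totals
share_raw <- totals / sum(totals)

# Relative width when the square root of the totals is used instead
share_sqrt <- sqrt(totals) / sum(sqrt(totals))

print(round(rbind(raw = share_raw, sqrt = share_sqrt), 3))
```

With the raw totals, class C would get only about 1% of the width; with the square root it gets closer to 7%, at the cost of the widths no longer being strictly proportional to class size.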

StupidWolf
  • Thank you @StupidWolf, for your time and efforts. That's a pretty elaborate answer to my question. It saved my day. – amty Nov 20 '20 at 04:33