Background: Point-biserial correlation is used to measure the relationship between a binary variable, x, and a continuous variable, y.
Methods:
I use the cor.test() function to calculate R and the p-value:
# the two vectors
x <- mtcars$am
y <- mtcars$mpg
#calculate point-biserial correlation
cor_result <- cor.test(x, y)
cor_result$p.value
cor_result$estimate
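As a sanity check, the point-biserial coefficient is the Pearson correlation with a 0/1 variable, so it can be reproduced by hand from the two group means. This is a minimal sketch; the names n0, n1, m0, m1, s_n and r_pb are introduced here for illustration only and are not part of the code above.

```r
# Point-biserial correlation computed by hand (illustrative sketch)
x <- mtcars$am   # binary variable (0/1)
y <- mtcars$mpg  # continuous variable

n  <- length(x)
n1 <- sum(x == 1)        # size of the group coded 1
n0 <- sum(x == 0)        # size of the group coded 0
m1 <- mean(y[x == 1])    # mean of y in the group coded 1
m0 <- mean(y[x == 0])    # mean of y in the group coded 0

# standard deviation of y with divisor n (not n - 1)
s_n <- sqrt(mean((y - mean(y))^2))

# r_pb = (m1 - m0) / s_n * sqrt(n1 * n0 / n^2)
r_pb <- (m1 - m0) / s_n * sqrt(n1 * n0 / n^2)

# should agree with the Pearson estimate from cor.test()
cor_result <- cor.test(x, y)
all.equal(r_pb, unname(cor_result$estimate))

# the p-value should match a pooled-variance two-sample t-test
t_result <- t.test(y ~ x, var.equal = TRUE)
all.equal(t_result$p.value, cor_result$p.value)
```

Both comparisons should return TRUE, which means the correlation and the equal-variance t-test on mpg by am describe the same difference in group means.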
Then I use ggplot2 to plot it this way; the numbers inside the points denote the number of cylinders (cyl):
library(see) # theme_modern()
library(dplyr)
library(ggplot2)
# plot
mtcars %>%
  mutate(am = factor(am)) %>%
  mutate(id = row_number()) %>%
  ggplot(aes(x = id, y = mpg, color = am, label = cyl)) +
  geom_point(size = 8, alpha = 0.5) +
  geom_text(color = "black", hjust = 0.5, vjust = 0.5) +
  scale_color_manual(values = c("steelblue", "purple"), labels = c("No", "Yes")) +
  scale_x_continuous(breaks = 1:32, labels = 1:32) +
  scale_y_continuous(breaks = scales::pretty_breaks()) +
  geom_text(aes(x = 10, y = 30,
                label = ifelse(am == 0, "R = 0.5998324, p = 0.0002850207", "")),
            color = "black",
            size = 4) +
  facet_wrap(. ~ am,
             nrow = 1, strip.position = "bottom") +
  labs(y = "mpg",
       color = "Automatic vs Manual transmission") +
  theme_modern() +
  theme(
    aspect.ratio = 2,
    strip.background = element_blank(),
    strip.placement = "outside",
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    text = element_text(size = 16)
  )
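One thing I am unsure about is the hard-coded annotation: because the label is mapped inside aes(), it is drawn once per row of the data rather than once per panel. A possible alternative, sketched here under the assumption that a one-row data frame is passed to geom_text(), would build the text from cor_result; label_df is a hypothetical name used only for this illustration.

```r
# Build the annotation from the test result instead of typing it by hand
cor_result <- cor.test(mtcars$am, mtcars$mpg)

label_df <- data.frame(
  am    = factor(0, levels = c(0, 1)),  # show the label only in the am = 0 panel
  x     = 10,
  y     = 30,
  label = sprintf("R = %.2f, p = %.4f",
                  cor_result$estimate, cor_result$p.value)
)

# Intended use in the plot above, replacing the second geom_text() layer:
# geom_text(data = label_df, aes(x = x, y = y, label = label),
#           inherit.aes = FALSE, color = "black", size = 4)
```

With a single row, the label is drawn exactly once, and the rounded values stay in sync with the test result if the data change.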
My question:
Would you consider this an appropriate figure to show the correlation between am and mpg?
Could you give me a hint on how to improve this plot?