I want to compare boys' and girls' exam performance within the same school. However, R keeps importing my male_scores column as a factor, so to deal with that problem I have written the code below.
The same approach has solved similar problems for me in the past, but in this case as.double() does not return the actual scores: it converts male_scores into what looks like a ranking from highest to lowest (i.e. 1, 2, 3, 4, etc.).
I have also tried as.numeric(), which unfortunately gave the same result.
library(ggplot2)
library(readxl)
library(tidyr)
library(dplyr)
library("ggalt")
gender_comparison <- read.csv(file = "dumbbell_plot.csv")
# Change male_scores from factor into numeric format
gender_comparison <- gender_comparison %>%
  mutate(male_scores = as.double(male_scores))
# Check format of variables
sapply(gender_comparison, class)
## Create a dumbbell plot to compare boys' and girls' performance within the same school
ggplot(gender_comparison, aes(x = male_scores, xend = female_scores, y = school_name)) +
  geom_dumbbell(size = 1, color = "grey",
                colour_x = "blue", colour_xend = "red",
                dot_guide = TRUE, dot_guide_size = 0.20) +
  scale_x_continuous(limits = c(70, 90)) +
  labs(x = "Average Exam Scores", y = "City",
       title = "The Gender Standardized Exams Score Gap Remains Prevalent Within Schools") +
  theme(panel.grid.major.x = element_line(size = 0.20)) +
  theme(panel.grid.major.y = element_blank()) +
  theme(axis.text.y = element_text(size = rel(0.55)))
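
A possible explanation for the "ranking" behaviour, offered as a sketch rather than a confirmed diagnosis of this particular file: calling as.double() or as.numeric() directly on a factor returns the factor's internal integer level codes, not the values printed as labels. Converting to character first recovers the numbers. The vector below is made-up stand-in data (the real contents of dumbbell_plot.csv are not shown here), and the "n/a" entry is an assumption illustrating the kind of stray text that can cause a numeric column to be read as a factor.

```r
# Hypothetical stand-in for male_scores as imported (not the real data)
male_scores <- factor(c("78.5", "81.2", "74.9", "n/a"))

# Direct conversion returns the integer level codes, which look like a ranking
codes <- as.double(male_scores)
print(codes)

# Converting via character recovers the numbers; non-numeric entries become NA
# (the coercion warning is suppressed here because the NA is expected)
scores <- suppressWarnings(as.numeric(as.character(male_scores)))
print(scores)

# List the raw entries that failed to convert, to see what needs cleaning
bad_entries <- as.character(male_scores)[is.na(scores)]
print(bad_entries)
```

If this is indeed the cause, the same conversion could be used inside the mutate() call, i.e. as.numeric(as.character(male_scores)), and inspecting the entries that turn into NA should show what made R treat the column as non-numeric in the first place.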