I want to compare performance by gender within the same school. However, R keeps importing my male_scores column as a factor, so to deal with that problem, I have written the code below.

The code has solved similar problems for me in the past, but in this case as.double() does not work: it converts male_scores into a ranking from highest to lowest (i.e. 1, 2, 3, 4, etc.) instead of keeping the actual scores.

I have also tried as.numeric(), which unfortunately did not work either.

library(ggplot2)
library(readxl)
library(tidyr)
library(dplyr)
library("ggalt")

gender_comparison <- read.csv(file = "dumbbell_plot.csv")

# Change male_scores from factor into numeric format
gender_comparison <-  gender_comparison %>% 
  mutate(male_scores = as.double(male_scores))

# Check format of variables
sapply(gender_comparison, class)

## Create a dumbbell plot to compare boys' and girls' performance within the same school
ggplot(gender_comparison, aes(x=male_scores, xend=female_scores, y=school_name)) + 
  geom_dumbbell(size=1,color="grey", 
                colour_x = "blue", colour_xend = "red",
                dot_guide=TRUE, dot_guide_size=0.20) +
  scale_x_continuous(limits = c(70,90)) +
  labs(x="Average Exam Scores", y="City", 
       title="The Gender Standardized Exams Score Gap Remains Prevalent Within Schools") +
  theme(panel.grid.major.x=element_line(size=0.20)) +
  theme(panel.grid.major.y=element_blank())+
  theme(axis.text.y=element_text(size = rel(0.55)))

Z.Lin
maldini425
  • One solution might be simply to pass `stringsAsFactors = FALSE` to `read.csv`. Also, you might look at `help(as.numeric)` for advice on converting factor labels to numeric: `as.numeric(levels(male_scores))[male_scores]` – sboysel Oct 20 '19 at 03:43
  • See [this SO post](https://stackoverflow.com/q/3418128/3277821) for more details on converting factors to numeric – sboysel Oct 20 '19 at 03:46
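
The suggestions in the comments can be sketched roughly as follows. This is only a minimal illustration in base R: the toy data frame and its values are made up to stand in for dumbbell_plot.csv, and `codes` is a hypothetical helper variable. It shows why as.double() on a factor returns the internal level codes, and how going through the labels recovers the real scores.

```r
# Toy data standing in for dumbbell_plot.csv (values are invented for illustration)
gender_comparison <- data.frame(
  school_name   = c("A", "B", "C"),
  male_scores   = factor(c("78.5", "81.2", "74.9")),
  female_scores = c(80.1, 83.4, 79.0)
)

# as.double() on a factor returns the integer level codes, not the labels,
# which is why the scores look like a ranking
codes <- as.double(gender_comparison$male_scores)

# Convert via the labels instead: factor -> character -> numeric
gender_comparison$male_scores <-
  as.numeric(as.character(gender_comparison$male_scores))

# Check the result
sapply(gender_comparison, class)
```

When reading the real file, `read.csv(file = "dumbbell_plot.csv", stringsAsFactors = FALSE)` should avoid the factor in the first place. If the column then comes in as character rather than numeric, it probably contains some non-numeric entries; those become NA (with a warning) when passed through as.numeric(), so they are worth inspecting first.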
