I would like to plot several boxplots in one chart and know there are already similar threads out there, but none seem to be applicable to my case.
Description of the data I need to visualize: I have voter survey data on which policy areas are perceived as important (x), and I want to contrast that with data on how likely these voters are to ever vote for a given party (y). This is straightforward for a single party, but the idea is to plot three parties in one figure (otherwise we would have too many figures, and they would be hard to compare with one another).
So let's take these hypothetical data (MyData):
Party_A Party_B Party_C Salience
8 2 5 "Environmental policy"
7 0 4 "Environmental policy"
9 3 6 "Environmental policy"
0 9 4 "Tax policy"
1 8 3 "Tax policy"
2 6 3 "Tax policy"
2 3 9 "Immigration policy"
3 5 9 "Immigration policy"
1 6 0 "Immigration policy"
Here "Party_A" to "Party_C" hold the answers to "Would you ever consider voting for... (0-10 scale)", and "Salience" indicates which policy area a respondent mentioned as important. (I also have a set of binary variables, each named after a policy area, that equal 1 if that area was mentioned and 0 if not, in case this is needed.)
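For reproducibility, here is a minimal sketch that builds the hypothetical data above as a data frame. Storing Salience as a character column (rather than a factor) is an assumption made for this sketch.

```r
# Rebuild the hypothetical MyData table shown above.
# Assumption: Salience is kept as a character column, not a factor.
MyData <- data.frame(
  Party_A  = c(8, 7, 9, 0, 1, 2, 2, 3, 1),
  Party_B  = c(2, 0, 3, 9, 8, 6, 3, 5, 6),
  Party_C  = c(5, 4, 6, 4, 3, 3, 9, 9, 0),
  Salience = rep(c("Environmental policy",
                   "Tax policy",
                   "Immigration policy"), each = 3),
  stringsAsFactors = FALSE
)

# Quick look at the structure
str(MyData)
```

This toy version has no missing values, so it reproduces the overlapping boxes but not the NA category on the x-axis.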
Now this is what I tried:
library(ggplot2)
ggplot(MyData, aes(Salience,Party_A)) + geom_boxplot(fill="black", alpha=.5) +
geom_boxplot(aes(Salience,Party_B), fill="blue", alpha=.5) +
geom_boxplot(aes(Salience,Party_C), alpha=.5) +
geom_hline(yintercept=5, color="darkred", linetype="dotted") +
theme(text=element_text(family="serif"), panel.background=element_blank(),
axis.text.x=element_text(angle=90,hjust=1,vjust=.3))
There are two issues with this that I cannot get solved:
- The boxes are obviously on top of each other, and even with alpha=.5 the plot looks messy and nothing can be compared. Is there a way to group them, like a cluster of three boxes for each policy area? It would obviously be nice to do it like here, but my data structure clearly doesn't allow me to implement this simply by including fill=labels as a group indicator.
- Another issue is the NA bar that I cannot get rid of. I tried both including na.omit() in the ggplot code and subsetting the data beforehand with MyData[!is.na(MyData)]. In both cases the chart disappears.
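To make the second issue easier to reproduce, here is a small sketch of what the subsetting attempt returns. The data frame `toy` and its single missing Salience value are hypothetical, made up only for illustration. As far as I can tell, indexing a data frame with the logical matrix from `!is.na()` does not return a data frame, which may be related to the chart disappearing.

```r
# Hypothetical miniature of MyData with one missing Salience value
toy <- data.frame(
  Party_A  = c(8, 0, 2),
  Salience = c("Environmental policy", NA, "Tax policy"),
  stringsAsFactors = FALSE
)

# The subsetting attempt from above: the index is a logical matrix,
# so the result is a flat vector of the non-missing cells
res <- toy[!is.na(toy)]
is.data.frame(res)   # FALSE

# For comparison, filtering whole rows on one column keeps a data frame
# (an untested guess at what might be needed, not a confirmed fix)
rows_kept <- toy[!is.na(toy$Salience), ]
is.data.frame(rows_kept)   # TRUE
```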
Is there any solution to this? Grateful for any advice!