I'm trying to plot a boxplot in R using ggplot2.
Here's my code with sample data:
df = structure(list(Closeness = c(0.0919540229885057, 0.0950259836674091, 0.0957367240089753, 0.0960240060015004, 0.0901408450704225, 0.0970432145564822, 0.0939794419970631, 0.0943952802359882, 0.0921526277897768, 0.0914285714285714, 0.0933625091174325, 0.0953090096798213, 0.0917562724014337, 0.0960960960960961, 0.0937728937728938, 0.0909090909090909, NA, 0.0946045824094605, 0.0864280891289669, 0.0879120879120879, 0.0905233380480905, 0.100313479623824, 0.0993017843289372, 0.0942562592047128, 0.0950965824665676, 0.0907801418439716, NA, NA, 0.0950965824665676, 0.0913633119200571, NA, 0.0926864590876177, NA, 0.0948148148148148, 0.0958801498127341, 0.0945347119645495, 0.0931586608442504, 0.090014064697609, 0.0968229954614221, 0.0963855421686747, 0.0926193921852388, 0.0919540229885057, 0.0947446336047372, 0.0917562724014337, 0.0905874026893135, 0.0950965824665676, NA, 0.0926193921852388, 0.0900774102744546, 0.0977845683728037), Var1 = c("Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group", "Group"), Var2 = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "A", "A", "K", "K", "G", "G", "N", "N", "O", "O", "A", "P", "P", "P", "Q", "Q", "Q", "Q", "A", "A", "A", "A", "R", "R", "R", "R", "S", "S", "S", "S", "L", "L", "L", "L", "L", "L", "L")), .Names = c("Closeness", "Var1", "Var2"), row.names = c(NA, 50L), class = "data.frame")
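library(reshape2)  # provides melt()
library(ggplot2)
# Reshape to long form. The sample above appears to already be in long form,
# so these steps are meant for the full dataset.
tmp <- data.frame(df, check.names = TRUE)
tmp <- melt(tmp, id = "Closeness", variable.name = "Var1", value.name = "Var2")
# Strip a trailing ".N" suffix from the variable names (e.g. "Group.1" -> "Group")
tmp$Var1 <- gsub("(.*)\\.[0-9]", "\\1", tmp$Var1)
df <- subset(tmp, Var2 != "")
df_g <- subset(df, Var1 == "Group")
df_c <- subset(df, Var1 == "Cat")
# Plot the Group subset; aes() refers to columns of the data passed to ggplot()
ggplot(df_g, aes(x = Var2, y = Closeness)) +  # geom_point() +
  geom_boxplot(outlier.size = 1.5)  # + geom_jitter(position = position_jitter(width = .2, height = 0))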
Which produces this (with the full dataset):
Now, I have two problems:
- I'd like the categories (A, B, C, D) to be ordered by descending mean;
- Some categories only have one sample (i.e. B, D, and E). I'd like to remove them before plotting.
Is this possible using ggplot2? If so, how should I proceed?
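To make the two requirements concrete, here is a minimal sketch of one possible approach. It does the filtering and reordering on the data frame before plotting rather than inside ggplot2 itself. The `toy` data frame and its values are made up for illustration; only the column names `Var2` and `Closeness` come from the code above. Dropping `NA` values before counting samples is an assumption about the desired behaviour.

```r
library(ggplot2)

# Hypothetical stand-in for df_g: same column names, made-up values
toy <- data.frame(
  Var2      = c("A", "A", "A", "B", "K", "K", "L", "L", "L"),
  Closeness = c(0.092, 0.096, NA, 0.095, 0.093, 0.091, 0.097, 0.096, 0.098)
)

# Assumption: a missing Closeness should not count as a sample
toy <- toy[!is.na(toy$Closeness), ]

# Keep only categories that have more than one observation
n_per_cat <- table(toy$Var2)
kept <- toy[toy$Var2 %in% names(n_per_cat)[n_per_cat > 1], ]

# reorder() sorts levels by increasing FUN value, so negate Closeness
# to get the category with the largest mean first
kept$Var2 <- reorder(factor(kept$Var2), -kept$Closeness, FUN = mean)

p <- ggplot(kept, aes(x = Var2, y = Closeness)) +
  geom_boxplot(outlier.size = 1.5)
print(p)
```

Because ggplot2 draws a discrete x axis in factor-level order, setting the levels of `Var2` beforehand is enough to control the order of the boxes.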