I am trying to plot one discrete integer variable, b, against another, a. I want to show the median and IQR of b for each value of a, with the x-axis sorted by the median of b. I have used the following code:

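library(ggplot2)

# Simulated stand-in for the real data (which cannot be shared)
a <- sample(1:10, 1000, replace = TRUE)
b <- sample(1:300, 1000, replace = TRUE)
df <- data.frame(a, b)

# x: a, reordered by the median of b within each level of a
ggplot(data = df, aes(x = reorder(a, b, median),
                      y = b)) +
    labs(x = "a", y = "b", title = "Variation of b by a") +
    # Point = median of b, range = IQR (25th to 75th percentile).
    # In ggplot2 >= 3.3.0 these arguments are named fun, fun.min, fun.max.
    stat_summary(fun.y    = function(z) { median(z) },
                 fun.ymin = function(z) { quantile(z, 0.25) },
                 fun.ymax = function(z) { quantile(z, 0.75) }) +
    geom_smooth(method = "lm") +
    theme_classic()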

I end up with the following result:

[Image: plot of the median and IQR of b for each a]

This is what I'm looking for except that the last point is not in line with the other points. Why is the last point not sorted properly?
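One way to see which order `reorder()` actually produced, independent of ggplot, is to compare the factor levels with the per-group medians. The sketch below is only a diagnostic on the simulated `a` and `b` from the code above. The seed and the names `med`, `lev` and `med_in_axis_order` are additions for illustration, not part of the original code.

```r
set.seed(1)  # assumption: seed added so the sketch is repeatable
a <- sample(1:10, 1000, replace = TRUE)
b <- sample(1:300, 1000, replace = TRUE)

# Per-group medians of b, named by the value of a
med <- tapply(b, a, median)

# Level order chosen by reorder(); this is the x-axis order in the plot
lev <- levels(reorder(a, b, median))

# Medians listed in axis order
med_in_axis_order <- as.numeric(med[lev])
print(data.frame(a = lev, median_b = med_in_axis_order))

# TRUE when the axis really is sorted by the median of b
all(diff(med_in_axis_order) >= 0)
```

Running the same comparison on the real data should show whether the out-of-line group has an unexpected median (for example `NA`) or whether the levels are ordered correctly and the problem lies in the plotting step.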

TimF
  • can you please add a reproducible example? https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Melissa Key Jun 26 '18 at 19:23
  • When I run the code (and data) you shared in a new R session, it works just fine. Can't reproduce the problem... – Gregor Thomas Jun 26 '18 at 19:42
  • Hi, I can't share the specific data, but I added the creation of a data frame that should be able to recreate what I have. I am more interested to know if anyone knows why I am getting a result that isn't "completely" sorted. – TimF Jun 26 '18 at 19:43
  • Yeah, I don't actually know how to recreate the problem I am having without the data I have, and unfortunately I cannot share that. – TimF Jun 26 '18 at 19:43
  • Not much we can do if we can't see the problem. I would suggest that perhaps you could anonymize your data (or anonymize a subset of it that shows the problem). I haven't used it, but you could try out the [`anonymizer` package](https://CRAN.R-project.org/package=anonymizer). Find the smallest possible subset that illustrates the problem and see if you can find the problem or alter it enough to share. Like, if you try your code on `df2 = subset(your_data, a %in% c("20", "16", "15"))` do you still get the problem? Etc. – Gregor Thomas Jun 26 '18 at 19:46
  • https://stackoverflow.com/questions/16622979/reorder-not-correctly-reordering-a-factor-variable-in-ggplot suggests that clearing the workspace and restarting R may work. – Weihuang Wong Jun 28 '18 at 04:09
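Following the suggestions in the comments, two checks that may be worth running on a small subset of the real data are sketched below. The data frame `df2` here is a hypothetical stand-in (the real data is not shown), and the missing value in `b` is an assumption about one possible cause, not a confirmed diagnosis. As far as I know, ggplot2 drops rows with a missing y before summarising (with a warning), while `reorder()` sorts groups whose score is `NA` last, which could leave one point at the right edge out of line with the rest.

```r
# Hypothetical stand-in for the real data: group "15" has one missing b
df2 <- data.frame(
  a = c("20", "20", "20", "16", "16", "16", "15", "15", "15"),
  b = c(10, 20, 30, 40, 50, 60, 1, 2, NA)
)

# 1. Is median() or reorder() masked by an object in the workspace?
#    In a clean session each should list only "package:stats".
find("median")
find("reorder")

# 2. Does any group contain a missing b? median() returns NA for it.
tapply(df2$b, df2$a, median)

# reorder() sorts NA scores last, so "15" lands at the right edge
# even though its non-missing values are the smallest.
levels_with_na <- levels(reorder(df2$a, df2$b, median))
levels_with_na

# Passing na.rm = TRUE through to median() restores the expected order
levels_na_rm <- levels(reorder(df2$a, df2$b, median, na.rm = TRUE))
levels_na_rm
```

If the real data does contain missing values, using `reorder(a, b, median, na.rm = TRUE)` in the `aes()` call would be one thing to try.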

0 Answers