1

Updated with sample data:

mydata <- read.table(header = TRUE, text = "
Item  Value  Site
A     96     site1
B     1      site1
C     2      site1
A     1      site2
B     62     site2
A     19     site3
B     1      site3
C     11     site3
D     9      site3
")

What I'm trying to do is plot a stacked bar chart and reorder the Item variable differently for each site, sorting it by the Value column. For each site, the item with the largest percentage value should sit at the bottom of the stacked bar, followed by the next largest, and so on. I've tried several methods, but I couldn't get the segments within each stacked bar ordered by the Value column.

Edit with solution: follow the link marked as the answer above, plot each bar individually using geom_bar, and set the fill with the reorder function: aes(fill = reorder(Item, +Value))
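For readers who would rather not draw each bar as a separate layer, here is a minimal sketch of an alternative that uses the group aesthetic instead. It is not the exact code from the linked answer. The stack_rank column is a hypothetical helper introduced only for illustration, and the sketch relies on ggplot2's default behaviour of stacking in reverse group order, so the highest rank should end up at the bottom of each bar.

```r
# Sample data from the question.
mydata <- read.table(header = TRUE, text = "
Item  Value  Site
A     96     site1
B     1      site1
C     2      site1
A     1      site2
B     62     site2
A     19     site3
B     1      site3
C     11     site3
D     9      site3
")

# Rank items within each site: 1 = smallest Value, n = largest Value.
# stack_rank is a hypothetical helper column, not part of the original data.
mydata$stack_rank <- ave(mydata$Value, mydata$Site,
                         FUN = function(v) rank(v, ties.method = "first"))

# Draw the chart only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # fill keeps one colour per Item; group decides the stacking order per bar.
  p <- ggplot(mydata, aes(x = Site, y = Value, fill = Item, group = stack_rank)) +
    geom_col()
  if (interactive()) print(p)
}
```

Because the colour still comes from Item, the legend stays consistent across sites while the stacking order is free to differ from bar to bar.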

Hack-R
  • Welcome to SO! Just a heads up - the "snippet" / Run Code feature won't work on R code (when you type a question on SO - it works for JS, CSS, and HTML I believe). – Hack-R Mar 01 '19 at 16:54
  • @lookingforbirds: have you tried this https://stackoverflow.com/questions/51098496/order-stacked-ggplot2-percentage-bar-plot-by-y-continuous-value-repost?rq=1 – Tung Mar 01 '19 at 16:55
  • In the example you give, it seems like sorting on one variable is enough---a consistent order will work between sites, e.g., A,E,D,B,G,F,C in one answer. Getting the order to be different in different bars (different sites) is much harder. **If that is what you are after**, I would recommend providing sample data that requires it, for example, make substrate A the highest percentage at one site and the lowest percentage at another. – Gregor Thomas Mar 01 '19 at 17:09
  • thanks for the comments! @Gregor I'm not trying to sort it based on a consistent order but based on the percentage values of the substrate, from largest to smallest at each site. Will add some additional sample data! – lookingforbirds Mar 01 '19 at 17:58
  • Excellent. I would recommend **simplifying your example** rather than just adding to it, to make that very clear. You don't need 7 substrates and 4 sites to get that point across---3 substrates and 2 or 3 sites is plenty. Just make sure it illustrates the problem---the answers so far implicitly sort the substrates by the median percentage (`fct_reorder`), the mean percentage (`reorder`), or the max percentage (Hack-R's answer) across all sites. Make sure your example data would give *different substrate orderings for each site* since that is what you want. – Gregor Thomas Mar 01 '19 at 18:03
  • With a little poking around, [this seems like probable dupe](https://stackoverflow.com/q/53596262/903061)... IceCreamToucan's solution seems quite good. – Gregor Thomas Mar 01 '19 at 18:16
  • You're right, what I'm trying is to get different orderings for each site. And yes that link seems promising, thanks I'll give that a go! – lookingforbirds Mar 01 '19 at 18:20
  • I'm going to close this as a dupe. If you can't get it working, edit the question to show an attempt, and we'll re-open and figure it out. (Also, in that case, I'd again suggest editing your sample data to be **simple** ;) – Gregor Thomas Mar 01 '19 at 18:23
  • thanks! managed to get it to work:) – lookingforbirds Mar 01 '19 at 18:55

3 Answers

4

Typically, when I need to reorder a character variable for display, it's easiest to convert the variable to a factor. Factors are a bit notorious in R for causing problems, but many of these can be easily overcome with help from the forcats package. In fact, the entire purpose of the package is to make working with factors in R easier.

The function forcats::fct_reorder is for reordering a factor by another variable (based on median values by default). In this particular instance, we are converting the substrate column into a factor, and reordering the substrate factor based on percentage...all in a single call.

library(ggplot2)
library(forcats)

ggplot(data = mydata, 
       aes(x = site, 
           y = percentage, 
           fill = fct_reorder(substrate, percentage))) + 
geom_bar(stat = "identity") +
guides(fill = guide_legend(title = "substrate"))

Which gives the following:

reordered plot

I prefer to call fct_reorder inside my ggplot call, as this does not require changing the underlying data frame.
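To see which single order a median-based reordering would produce, the per-item medians can be computed in base R. This is only a sketch: it assumes the Item and Value columns from the question's updated sample data rather than the substrate and percentage columns used above, and item_medians is a name made up for illustration.

```r
# Sample data from the question.
mydata <- read.table(header = TRUE, text = "
Item  Value  Site
A     96     site1
B     1      site1
C     2      site1
A     1      site2
B     62     site2
A     19     site3
B     1      site3
C     11     site3
D     9      site3
")

# Median Value per Item across all sites (item_medians is a hypothetical name).
item_medians <- tapply(mydata$Value, mydata$Item, median)

# Items sorted from smallest to largest median: one global order for all sites.
median_order <- names(sort(item_medians))
print(median_order)
```

Whatever the summary function, the result is one ordering shared by every site, which is why this approach cannot give each bar its own stacking order.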

If you would like to read more about forcats, I suggest this tidyverse site, or if you want to learn more about factors within R in general, start with the chapter on factors in R for Data Science.


Note

This question has changed since my original response. I believe an adequate answer can be found in the linked question here.

Dave Gruenewald
  • thanks for the comment! I've tried this and my data just reorders in a consistent order, but not in accordance to the percentage value. Maybe the sample data I provided is inaccurate, I have uploaded my actual data which might be different. – lookingforbirds Mar 01 '19 at 18:02
  • This is exactly what I was looking for. Can you update your answer with a short description of the `forcats` package? – shadow_dev Mar 13 '20 at 22:43
  • @shadow_dev Happy to help! I have updated my answer to provide more detail, along with supporting links – Dave Gruenewald Mar 15 '20 at 23:13
1

I think forcats is a great way to do this, but before we had it, or when you want to avoid an extra package for handling factors, this is the more traditional approach:

library(ggplot2)

# Set the factor levels in order of increasing percentage (one order for all sites)
mydata$substrate <- factor(mydata$substrate, levels = unique(mydata$substrate[order(mydata$percentage)]))

ggplot(data = mydata, aes(x = site, y = percentage, fill = substrate)) +
  geom_bar(stat = "identity")


Hack-R
  • thanks for the advice! I've tried this and my data just reorders in a consistent order, but not in accordance to the percentage value. Maybe the sample data I provided is inaccurate, I have uploaded my actual data which might be different. – lookingforbirds Mar 01 '19 at 18:03
1

This will work for you if you use the reorder function in the fill aesthetic:

ggplot(data = mydata, aes(x = site, y = percentage, fill = reorder(substrate, percentage))) +
  geom_bar(stat = "identity", position = "stack")
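As a quick check of what reorder does here, the sketch below applies it to the question's updated sample data. It assumes the Item and Value column names from that data instead of substrate and percentage. By default reorder sorts the levels by the mean of the second argument per level.

```r
# Sample data from the question.
mydata <- read.table(header = TRUE, text = "
Item  Value  Site
A     96     site1
B     1      site1
C     2      site1
A     1      site2
B     62     site2
A     19     site3
B     1      site3
C     11     site3
D     9      site3
")

# reorder() sorts the levels by the mean Value per Item across all sites.
global_levels <- levels(reorder(mydata$Item, mydata$Value))
print(global_levels)
```

The levels come out as one fixed sequence for the whole data set, so every site is stacked in the same order regardless of which item is largest at that site.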

Lunalo John
  • thanks for this! I've tried this and my data just reorders in a consistent order, but not in accordance to the percentage value. Maybe the sample data I provided is inaccurate, I have uploaded my actual data which might be different. – lookingforbirds Mar 01 '19 at 18:03