
I've created two grouped barplots from the same dataframe, and I'm having the same problem with both. In the first plot, I've grouped data by category (fields in this case) on the x axis, and filled each bar by year, which I've also categorized with fill=as.factor(Year). In the second plot, I've switched the variables that are mapped to the x axis and to the color of the bars (fill). In other words, the data is grouped by year on the x axis, and filled according to field (fill=as.factor(fnum)). It appears ggplot is ordering the bars based on the y values, but I'm trying to order the data consistently in a logical, categorical way (by year or field). Is there any way to specify the order, either in the code for the plots or in the way I structure my dataframe? Thanks.

Code 1:

ggplot(data=OM, aes(factor(fnum), y=Value, fill=as.factor(Year))) + 
  geom_bar(stat="identity", position = "dodge")+
  labs(x='Field', y='Soil Organic Matter %', fill='Year',
    title = 'Organic Matter Plotted by Field and Year')+
  theme(axis.text.x = element_text(angle=65, vjust=0.6))

First plot

Code 2:

ggplot(data=OM, aes(factor(Year), y=Value, fill=as.factor(fnum))) + 
  geom_bar(stat="identity", position = "dodge",color='black')+
  labs(x='Year', y='Soil Organic Matter %', fill='Field',
       title = 'Organic Matter Plotted by Field and Year')+
  theme(axis.text.x = element_text(angle=65, vjust=0.6))

Second plot

neilfws
nbeck1999
  • Hmmm... Can you dput(OM) ? I am really puzzled how you get this. – StupidWolf Jan 22 '20 at 23:55
  • Please read [how to make a great reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Providing such an example makes it much more likely that you receive a useful answer. – Axeman Jan 23 '20 at 00:10
  • Thank you @Axeman, I'll do that in the future. – nbeck1999 Jan 23 '20 at 03:22
  • @StupidWolf I see that dput() "Writes an ASCII text representation of an R object to a file or connection, or uses one to recreate the object." but I'm wondering if you could give some clarification of how that is helpful/useful - I'm new to stackoverflow so not quite sure how that function's used in this context. – nbeck1999 Jan 23 '20 at 03:30
  • @nbeck1999, there is lots of discussion about `dput` in the link I provided. It is often used to share data in a reproducible way. It doesn't solve your problem but helps you to ask a better question. – Axeman Jan 23 '20 at 15:47

2 Answers


Your data are not ordered by y value. If you look at the order of the fields, both plots follow the same order. By default, R sorts factor levels in ascending order (1 -> 9, A -> Z), and because the labels are compared as text, that is why your PRF values seem mis-ordered.

If you want a custom order, you can set it before calling ggplot, like this:

OM$fnum <- factor(OM$fnum, levels = c("PRF1","PRF1-2","PRF2","PRF3","PRF4","PRF5-1","PRF5-3","PRF6","PRF7","PRF8","PRF9","PRF10","PRF12"))

Then, it should plot everything consistently using the order you specified.
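
As a minimal sketch of the difference between the default and an explicit level order, the snippet below uses a few made-up field labels standing in for OM$fnum (the vector and the names `default_levels`, `wanted` and `custom_levels` are assumptions for illustration, not the asker's data):

```r
# Hypothetical field labels standing in for OM$fnum (not the asker's real data)
fnum <- c("PRF10", "PRF2", "PRF1", "PRF12")

# Default: factor() sorts the levels as character strings,
# so "PRF10" and "PRF12" land before "PRF2"
default_levels <- levels(factor(fnum))

# Explicit: the levels keep exactly the order given here
wanted <- c("PRF1", "PRF2", "PRF10", "PRF12")
custom_levels <- levels(factor(fnum, levels = wanted))

print(default_levels)
print(custom_levels)
```

Any value of the column that is missing from `levels` becomes NA, so the vector passed to `levels` should list every label that actually occurs in the data.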

Your Year column is already in ascending order (2015 before 2017 before 2019), so there is no need to change it.

Does this answer your question? If not, please consider providing a reproducible example of your dataset (see here: [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example))


PS: By the way, it seems that the Value column used as y in your plots is stored as a factor. Is that intentional?
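
If Value really is a factor, one common way to get the numbers back is to go through character first. This is only a sketch with made-up values (the vector `value` and the names `codes` and `actual` are assumptions, not taken from OM):

```r
# Hypothetical measurements standing in for OM$Value, stored as a factor
value <- factor(c("3.1", "10.2", "2.7"))

# as.numeric() on a factor returns the integer level codes, not the measurements
codes <- as.numeric(value)

# Converting to character first recovers the measurements themselves
actual <- as.numeric(as.character(value))

print(codes)   # 3 1 2
print(actual)  # 3.1 10.2 2.7
```

Applied to the real dataframe, this would be something like OM$Value <- as.numeric(as.character(OM$Value)), assuming every entry in the column parses as a number.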

dc37
  • I did notice that ggplot was ordering `field` in numerical ascending order, but the `levels` attribute was exactly what I was looking for, so thank you for that. For the first plot, however, there were a few instances where the 2019 bar (blue) came before the 2017 bar (green), such as PRF 1, 3 and 9. I'm not sure why that is the case, as `Year` is an `int` variable and should be ordered as such. Nevertheless, `levels` should be able to fix that too. – nbeck1999 Jan 23 '20 at 03:15
  • Also, thanks for your postscript; it was not my intention to have `Value` be a factor variable, so I'll go ahead and fix that in my dataframe. – nbeck1999 Jan 23 '20 at 03:21
  • I think your p.s. is the entire issue – Axeman Jan 23 '20 at 04:40

Try something like this before your code:

OM$fnum <- factor(OM$fnum, levels = c("PRF1", "PRF1-2", "PRF2", "PRF3", "PRF4",
                                    "PRF5-1", "PRF5-3", "PRF6", "PRF7", "PRF8", "PRF9", "PRF10", "PRF12"))
Zhiqiang Wang