I have a df that I want to visualise as a stacked percentage bar plot, with the stacks within each bar ordered in descending order. The df holds proportions and has been pivoted into long format. Below is a reprex with dummy data based on the real data I'm using.
library(tidyr)    # pivot_longer()
library(dplyr)    # %>% and filter()
library(ggplot2)

df <- data.frame(ID = c("A", "B", "C", "D", "E"),
                 a1 = c(0.452, 0.558, 0.554, 0.484, 0.661),
                 a2 = c(0.326, 0.373, 0.465, 0.434, 0.499),
                 a3 = c(0.450, 0.481, 0.613, 0.473, 0.504),
                 a4 = c(0.561, 0.681, 0.633, 0.504, 0.723))
dflong <- df %>%
  pivot_longer(!ID, names_to = "aa", values_to = "prop")
dflong$ID <- as.factor(dflong$ID)
# A tibble: 15 × 3
ID aa prop
<fct> <chr> <dbl>
1 A a1 0.452
2 A a2 0.326
3 A a3 0.45
4 A a4 0.561
5 B a1 0.558
6 B a2 0.373
7 B a3 0.481
8 B a4 0.681
9 C a1 0.554
10 C a2 0.465
11 C a3 0.613
12 C a4 0.633
13 D a1 0.484
14 D a2 0.434
15 D a3 0.473
dflong %>%
  ggplot(aes(x = ID, y = prop, fill = reorder(aa, -prop))) +
  geom_col(position = "fill", data = dflong %>% filter(ID == "A")) +
  geom_col(position = "fill", data = dflong %>% filter(ID == "B")) +
  geom_col(position = "fill", data = dflong %>% filter(ID == "C")) +
  geom_col(position = "fill", data = dflong %>% filter(ID == "D")) +
  geom_col(position = "fill", data = dflong %>% filter(ID == "E")) +
  geom_text(aes(label = scales::percent(prop)),
            position = position_fill(vjust = .5), size = 3, colour = "black") +
  scale_y_continuous(labels = NULL, breaks = NULL) +
  scale_fill_brewer(palette = "GnBu",
                    name = "") +
  coord_flip() +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.direction = "horizontal") +
  labs(caption = "",
       x = "",
       y = "")
My problem is that the resulting plot always swaps two values within "C": in the stacked bar, the values of C-a1 and C-a3 are exchanged. The image of the plot shows the issue clearly.
For C, a1 should be 55.4% and a3 should be 61.3%.
I have tried changing ID into a factor, changing aa into a factor, reordering the values in the original df, restarting a new session, updating RStudio, and running the code on the R GNU (in case it was an RStudio issue), but nothing I have done so far has fixed this problem. It seems to affect only one 'variable', C, and only the values of a1 and a3. I'm at my wits' end and would appreciate any kind of help, as the console isn't reporting any error that would explain this.
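One check that might help narrow this down, offered as a sketch rather than a confirmed cause: `reorder(aa, -prop)` orders the levels of `aa` by the mean of `-prop` over whatever rows it is evaluated on, so the order it produces for the full data need not match the order for a single ID. The base-R snippet below rebuilds the dummy data without tidyr and compares the two. The names `level_order`, `global_levels`, `per_id` and `differs` are made up for illustration and are not part of the code above.

```r
# Dummy data from the reprex, rebuilt in long format with base R only.
df <- data.frame(
  ID = c("A", "B", "C", "D", "E"),
  a1 = c(0.452, 0.558, 0.554, 0.484, 0.661),
  a2 = c(0.326, 0.373, 0.465, 0.434, 0.499),
  a3 = c(0.450, 0.481, 0.613, 0.473, 0.504),
  a4 = c(0.561, 0.681, 0.633, 0.504, 0.723)
)

# stack() puts the a1..a4 columns under each other, so ID is repeated once
# per column. Row order differs from pivot_longer(); the values do not.
stacked <- stack(df[-1])
dflong <- data.frame(
  ID   = rep(df$ID, times = ncol(df) - 1),
  aa   = as.character(stacked$ind),
  prop = stacked$values
)

# Hypothetical helper: level order that reorder(aa, -prop) yields for rows d.
level_order <- function(d) levels(reorder(d$aa, -d$prop))

# Order computed on all rows at once.
global_levels <- level_order(dflong)

# Order computed separately on each ID's rows.
per_id <- sapply(split(dflong, dflong$ID),
                 function(d) paste(level_order(d), collapse = " > "))

# IDs whose own order disagrees with the all-rows order.
differs <- names(per_id)[per_id != paste(global_levels, collapse = " > ")]

print(global_levels)  # "a4" "a1" "a3" "a2"
print(per_id)
print(differs)        # "C"
```

With the dummy data, C is the only ID whose own order (a4 > a3 > a1 > a2) differs from the all-rows order, and the pair that trades places is a1 and a3, the same pair that is swapped in the plot. Whether this is what actually drives the plot would still need checking, since each `geom_col` layer is given one ID's rows while `geom_text` is given all of them.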
EDIT: The reason I have used five separate geom_col functions is to ensure that each column is stacked horizontally. This was largely influenced by the answer to this question.