I have the following csv data:

component,command,error,errPercentage
init,acceptZone,MAX_ZONES,63.78014743
init,acceptZone,ERR_ZONE,32.93971707
init,acceptRefer,MAX_REFER,29.41176471
init,createZone,ERR_ZONE,2.47576316
init,getZone,NO_ZONE,1.107557237
init,joinZone,NO_ZONE,28.9197955
init,joinZone,ZONE_FULL,1.783299434
init,tempSet,RW_ONLY,17.30486009
init,removeSet,NON_AVAIL,1.50930378
init,sendSet,SEND_DEFERRED,2.529487652
init,sendSet,SEND_CANCELED,1.036675267
init,updateSet,NON_AVAIL,2.603311007

I'm using the following R code to create a chart:

library(dplyr)    # %>% and mutate()
library(forcats)  # fct_reorder()
library(ggplot2)

df <- read.csv("zones.csv", header=TRUE, sep=",")
df %>%
  mutate(command = fct_reorder(command, errPercentage, .desc = FALSE)) %>%
  ggplot(aes(x=errPercentage, y=command)) +
  geom_bar(stat="identity", fill="#f68060", alpha=.6, width=.4) + 
  xlab("Error %") +
  labs(title="RPC Error Rates") +
  theme_bw()

This is the chart produced:

[image: the bar chart produced by the code above]

I want the rows sorted by errPercentage, so acceptRefer and joinZone should be one below the other. I was able to do this with barplot(), but ggplot works quite differently. I like its improved appearance, though, so I'd like to stick with it.

wayne

1 Answer

You have multiple rows in your dataset for each command value. When you draw the bar plot, those values are stacked, so the length of each bar is their sum. For example, "acceptZone" has rows for "MAX_ZONES" and "ERR_ZONE", which add up to 63.8 + 32.9 = 96.7. But by default fct_reorder() doesn't order by the sum of the values; it uses the median of the values within each group. You need to change the summary function to sum.

This will do that:

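library(dplyr)    # %>% and mutate()
library(forcats)  # fct_reorder()
library(ggplot2)

df %>%
  # order the commands by the sum of their error percentages, not the default median
  mutate(command = fct_reorder(command, errPercentage, .fun=sum,  .desc = FALSE)) %>%
  ggplot(aes(x=errPercentage, y=command)) +
  geom_col(fill="#f68060", alpha=.6, width=.4) + 
  xlab("Error %") +
  labs(title="RPC Error Rates") +
  theme_bw()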
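
To see why the two orderings differ, it may help to compare the per-command median and sum directly. The following is a minimal base R sketch. It rebuilds the relevant columns of the question's data inline rather than reading zones.csv, and the variable names (med, tot, order_by_median, order_by_sum) are only illustrative:

```r
# Data copied from the question (only the two columns needed here).
df <- data.frame(
  command = c("acceptZone", "acceptZone", "acceptRefer", "createZone",
              "getZone", "joinZone", "joinZone", "tempSet", "removeSet",
              "sendSet", "sendSet", "updateSet"),
  errPercentage = c(63.78014743, 32.93971707, 29.41176471, 2.47576316,
                    1.107557237, 28.9197955, 1.783299434, 17.30486009,
                    1.50930378, 2.529487652, 1.036675267, 2.603311007)
)

# Per-command summaries: the median (what fct_reorder() uses by default)
# and the sum (the total bar length once the rows are stacked).
med <- tapply(df$errPercentage, df$command, median)
tot <- tapply(df$errPercentage, df$command, sum)

# Command names in ascending order of each summary.
order_by_median <- names(med)[order(med)]
order_by_sum    <- names(tot)[order(tot)]

print(order_by_median)
print(order_by_sum)
```

With this data, joinZone has a median of about 15.4, which puts it below tempSet (17.3), while its sum of about 30.7 puts it right next to acceptRefer (29.4), which is the ordering the question asks for.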

MrFlick