I made a dumbbell chart to show the difference in product sales between time windows (e.g. weekday vs. weekend), and I want to select the 20 products with the largest differences and show them in descending order. But my selection and ordering don't work properly.

Here is the data for the dumbbell chart:

>head(product_dumbbell)

product_aisle     daytime evening long medium short weekday weekend
1: candles           16       4    2      6    12      15       5
2: asian foods      115     25    23     29    88      90      50
3: baby accessories   7      3     0      0    10       7       3
4: baby body care     4      3     1      2     4       7       0
5: baby food formula 149    44    24     29    140    142      51
6: bakery desserts    53    11     6      6     52     47      17

And my code for the dumbbell chart is:

product_dumbbell%>%
   top_n(20)%>%
   ggplot() +
   aes(x=weekday, xend=weekend, y=product_aisle, 
       group=product_aisle) + 
   geom_dumbbell(color="#a3c4dc", 
            size=0.75, 
            colour_x="#edae52", 
            colour_xend = "#9fb059") + 
   labs(x=NULL, 
        y=NULL, 
        title="Product Dumbbell Chart: weekend VS weekday") +
    theme(plot.title = element_text(hjust=0.5, face="bold"),
          plot.background=element_rect(fill="#f7f7f7"),
          panel.background=element_rect(fill="#f7f7f7"),
          panel.grid.minor=element_blank(),
          panel.grid.major.y=element_blank(),
          panel.grid.major.x=element_line(),
          axis.ticks=element_blank(),
          legend.position="top",
          panel.border=element_blank())

R tells me that the selection is made by weekend. What I actually want is to select the top 20 by the difference between weekend and weekday, and to place them in descending order.

Can anybody who has made a dumbbell chart help me? Thanks a lot!

[1]: https://i.stack.imgur.com/SVxFw.png

markus
frida guo

1 Answer

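So if I understood correctly, you want the top 20 differences in descending order.
First, create a column with the differences using mutate. In the code I use abs to get absolute values. With that variable you can pick the top 20 differences by passing it as the wt argument of top_n; without wt, top_n uses the last column of the data, which is why your result was selected by weekend.
Then reorder the labels according to another variable; in this example I used difference with a leading - so that the factor levels are in descending order. Note that ggplot2 draws the first factor level at the bottom of the y axis, so with the - the largest difference ends up at the bottom; drop the - to put it at the top.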

library(dplyr)
library(ggalt)

product_dumbbell%>%
  mutate(difference = abs(weekend-weekday)) %>% #creates the variable of differences
  top_n(20, wt = difference) %>% # Choose the rows with top 20 difference
  ggplot() +
  aes(x=weekday, xend=weekend, y=reorder(product_aisle, -difference), 
      group=product_aisle) + #reorder the labels by descending difference value
  geom_dumbbell(color="#a3c4dc", 
                size=0.75, 
                colour_x="#edae52", 
                colour_xend = "#9fb059") + 
  labs(x=NULL, 
       y=NULL, 
       title="Product Dumbbell Chart: weekend VS weekday") +
  theme(plot.title = element_text(hjust=0.5, face="bold"),
        plot.background=element_rect(fill="#f7f7f7"),
        panel.background=element_rect(fill="#f7f7f7"),
        panel.grid.minor=element_blank(),
        panel.grid.major.y=element_blank(),
        panel.grid.major.x=element_line(),
        axis.ticks=element_blank(),
        legend.position="top",
        panel.border=element_blank())
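
To check the selection and ordering without drawing anything, here is a minimal sketch using only dplyr. It assumes a toy data frame built from the six rows shown in the question (the name `toy` and the choice of keeping the top 3 are mine, purely for illustration), and it uses `reorder` without the `-`, which is the variant that puts the largest difference at the top of a ggplot2 y axis.

```r
library(dplyr)

# Toy data: the six rows printed by head(product_dumbbell) in the question,
# reduced to the columns the chart uses. `toy` is a hypothetical name.
toy <- data.frame(
  product_aisle = c("candles", "asian foods", "baby accessories",
                    "baby body care", "baby food formula", "bakery desserts"),
  weekday = c(15, 90, 7, 7, 142, 47),
  weekend = c(5, 50, 3, 0, 51, 17),
  stringsAsFactors = FALSE
)

top <- toy %>%
  mutate(difference = abs(weekend - weekday)) %>%  # absolute weekday/weekend gap
  top_n(3, wt = difference) %>%                    # keep the 3 largest gaps (20 in the real chart)
  mutate(product_aisle = reorder(product_aisle, difference))  # levels ascending by gap

# Levels run from smallest to largest gap; ggplot2 draws the first level
# at the bottom, so the largest gap lands at the top of the y axis.
levels(top$product_aisle)
# "bakery desserts" "asian foods" "baby food formula"
```

In dplyr 1.0.0 and later, `top_n` is superseded by `slice_max`, so `slice_max(difference, n = 20)` should be a drop-in alternative for the selection step.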
Jorge Mendes
  • It seems that the order is ascending rather than descending, with the largest difference at the bottom – frida guo Aug 12 '19 at 12:52
  • Just remove the `-` before `difference` in the `aes` line: `aes(x=weekday, xend=weekend, y=reorder(product_aisle, difference), group=product_aisle)` – Jorge Mendes Aug 12 '19 at 15:00