
I have a data frame in which each variable is rated on a 1-5 scale. The table below shows the total count of each scale value per variable.

df<-data.frame(
  speed=c(2,3,3,2,2),
  race=c(5,5,4,5,5),
  cake=c(5,5,5,4,4),
  lama=c(2,1,1,1,2))

library(data.table)
dcast(melt(df), variable~value)

#  variable 1 2 3 4 5
#1    speed 0 3 2 0 0
#2     race 0 0 0 1 4
#3     cake 0 0 0 2 3
#4     lama 3 2 0 0 0 
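
For readers without data.table or reshape2, the same count table can be sketched in base R. This is only an illustration under the assumption that the scale always runs from 1 to 5; the name `counts` is made up for this example.

```r
# Same example data as above
df <- data.frame(
  speed = c(2, 3, 3, 2, 2),
  race  = c(5, 5, 4, 5, 5),
  cake  = c(5, 5, 5, 4, 4),
  lama  = c(2, 1, 1, 1, 2))

# Count each scale value per variable; factor() with fixed levels 1:5
# keeps a zero count for scale values that never occur in a column
counts <- t(sapply(df, function(x) table(factor(x, levels = 1:5))))
counts
```

The result is a matrix with one row per variable and one column per scale value.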

I want to draw a stacked bar chart that shows the mean and the scale values 1-5 on the x axis, with one bar per variable from the first column (speed, race, cake, lama).

I tried the solution from Stacked Bar Plot in R, but it is not what I am looking for.

[Image: sketch of the desired chart]

  • Does this answer your question? [Stacked Bar Plot in R](https://stackoverflow.com/questions/20349929/stacked-bar-plot-in-r) – user438383 Mar 28 '21 at 10:40
  • @user438383 No, it doesn't. – David Kružlík Mar 28 '21 at 10:48
  • Care to explain why? There’s dozens and dozens of good answers on how to make a stacked bar plot. Also it’s customary to show some evidence of what you tried before posting. – user438383 Mar 28 '21 at 11:12
  • @user438383 I don't know how to select variables 1-5 for the x axis, and I don't know how to add the mean to the chart. Long story short, I don't know how to create a stacked bar plot similar to the one in the posted picture. – David Kružlík Mar 28 '21 at 11:30
  • Edit your question to include those details then or else people will just assume you want a standard bar chart. – user438383 Mar 28 '21 at 11:52
  • @user438383 I tried a lot. I gave you the data.frame, so you can try to create it. There is a visualization of my idea as well. – David Kružlík Mar 28 '21 at 11:54

1 Answer


I had to try a few things and use some workarounds to get something very close to what you are looking for (assuming I understood the problem correctly):

library(dplyr)
library(ggplot2)
library(tidyr)

df<-data.frame(
  speed=c(2,3,3,2,2),
  race=c(5,5,4,5,5),
  cake=c(5,5,5,4,4),
  lama=c(2,1,1,1,2))

 # get the data in right shape for ggplot2
 dfp <-  df %>% 
   # a column that identifies the rows uniquely is needed ("name of data row")
   dplyr::mutate(ID = as.factor(dplyr::row_number())) %>% 
   # the data has to be reshaped into "tidy" (long) format, similar to unpivoting in Excel
   tidyr::pivot_longer(-ID) %>% 
   # order by name and ID
   dplyr::arrange(name, ID) %>% 
   # group by name 
   dplyr::group_by(name) %>% 
   # calculate percentage and cumsum to be able to calculate label position (p2)
   dplyr::mutate(p = value/sum(value),
                 c= cumsum(p),
                 p2 = c - p/2,
                 # the groups or x-axis values have to be recoded to numeric type
                 name = recode(name, "cake" = 1, "lama" = 2, "race" = 3, "speed" = 4))

# calculate the mean value per group (or label) as you want them in the plot
sec_labels <- dfp %>%
  dplyr::summarise(m = mean(value)) %>%
  pull(m)

dfp %>% 
  # building base plot, telling to fill by the new name variable
  ggplot2::ggplot(aes(x = name, y = value, fill = ID)) +
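  # make it a stacked bar chart of percentages; reverse = TRUE stacks ID 1 first,
  # so the segments line up with the cumulative label positions (p2)
  ggplot2::geom_bar(stat = "identity",
                    position = ggplot2::position_fill(reverse = TRUE)) +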
  # recode the x axis labels and add a secondary x axis with the labels
  ggplot2::scale_x_continuous(breaks = 1:4,
                              labels = c("cake", "lama","race", "speed"),
                              sec.axis = sec_axis(~., 
                                                  breaks = 1:4,
                                                  labels = sec_labels)) +
  # flip the chart on its side
  ggplot2::coord_flip() +
  # format the y axis (drawn horizontally after flipping) as percent
  ggplot2::scale_y_continuous(labels=scales::percent) +
  # add a layer with labels positioned according to p2
  ggplot2::geom_text(aes(label = value, 
                         y=p2)) +
  # give the plot a title
  ggplot2::ggtitle("meaningful plot name") +
  # put the legend on top
  ggplot2::theme(legend.position = "top") 

[Image: the resulting plot]
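
If the goal is instead to stack by scale value (1-5) rather than by data row, as the count table in the question suggests, a variation could look like the sketch below. This is an assumption about the desired chart, not something confirmed in the question. The names `long`, `means`, `axis_labels` and `p` are made up for this example, and mapping only `y` in `geom_bar()` assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

df <- data.frame(
  speed = c(2, 3, 3, 2, 2),
  race  = c(5, 5, 4, 5, 5),
  cake  = c(5, 5, 5, 4, 4),
  lama  = c(2, 1, 1, 1, 2))

# long format: one row per answer, with the scale value as a factor
long <- data.frame(
  name  = factor(rep(names(df), each = nrow(df)), levels = names(df)),
  value = factor(unlist(df, use.names = FALSE), levels = 1:5))

# mean per variable, shown as part of the axis label
means <- colMeans(df)
axis_labels <- setNames(
  paste0(names(means), " (mean ", round(means, 1), ")"),
  names(means))

p <- ggplot(long, aes(y = name, fill = value)) +
  # geom_bar() counts the rows itself; "fill" rescales each bar to 100 %
  geom_bar(position = "fill") +
  scale_y_discrete(labels = axis_labels) +
  scale_x_continuous(labels = scales::percent) +
  labs(x = "share of answers", y = NULL, fill = "scale value") +
  theme(legend.position = "top")
p
```

Here each bar is split by how often each scale value was chosen, and the mean is written next to the variable name instead of on a secondary axis.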

DPH