Context: when there are many categories, it can become hard to tell them apart in a bar plot. The plot below handles this situation nicely by linking the legend entries to the categories in the plot.

enter image description here

Question: is it possible to do something similar with ggplot2?

With ggplot2 it is straightforward to get this:

enter image description here

But I really do not know where to start to achieve the result shown in the first plot.

Here is some code to reproduce the ggplot2 version:

library(ggplot2)

ggplot(data = mtcars, aes(x = vs, y = disp, fill = factor(carb))) +
  geom_bar(stat = "identity")

Expected output (not as nice as the one presented above, but it shows the idea):

enter image description here

Paul

1 Answer

None of the plots has proper axis labels, but my guess is that the desired chart shows relative frequencies, while your plot seems to show absolute frequencies. I'm not sure about that, though.

Assuming that you want to produce a stacked bar chart giving the (relative) number of observations of a categorical variable in two groups, there are two ways to get the two stacked bars to be the same height:

  1. Both groups contain exactly the same number of observations. Then you can use absolute frequencies.
  2. The absolute frequencies are transformed to relative frequencies (or percentages) by dividing them by the total number of observations in each group.

You can calculate the relative frequencies yourself and use them as the y-values.
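A minimal sketch of that calculation, assuming the bars should show how many cars fall into each carb category within each vs group of mtcars (the question's own code plots disp instead, so this is only my reading of the desired chart). The names counts and rel_freq are made up for illustration.

```r
library(ggplot2)

# Absolute frequencies: number of cars in each vs/carb combination.
counts <- as.data.frame(table(vs = mtcars$vs, carb = mtcars$carb))

# Relative frequencies: divide each count by the total of its vs group.
counts$rel_freq <- counts$Freq / ave(counts$Freq, counts$vs, FUN = sum)

# Both stacked bars now reach a height of 1.
ggplot(counts, aes(x = vs, y = rel_freq, fill = carb)) +
  geom_col()
```

If the counts are all you need, geom_bar(position = "fill") on the raw data does the same normalization without the manual step.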

Or refer to this post, as it seems to describe exactly what you want using ggplot2.

Codebird
  • Hello @Codebird and thanks for your answer. I am sorry if my question is not clear enough. It is not about relative/absolute frequencies but about how to **draw** a line between the legend and the plot to better show what fill color relates to each category. I'll edit my question so it is clearer. – Paul May 03 '21 at 13:10