
I have data as follows:

library(ggplot2)
library(tidyr)
library(dplyr)

DT <- structure(list(value = structure(c(2, 3, 3, 4, 4, 5, 6, 6, 7, 
7, 8, 8, 9, 9, 1, 1, 2, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9
), label = NA_character_, class = c("labelled", "numeric")), 
    penalty = structure(c(0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 
    0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1), label = "", class = c("labelled", 
    "numeric")), count = c(1L, 1L, 2L, 3L, 3L, 5L, 11L, 2L, 30L, 
    10L, 48L, 13L, 62L, 16L, 1L, 1L, 1L, 2L, 7L, 4L, 10L, 4L, 
    19L, 6L, 33L, 7L, 39L, 10L, 50L, 13L), type = c("Truth", 
    "Truth", "Truth", "Truth", "Truth", "Truth", "Truth", "Truth", 
    "Truth", "Truth", "Truth", "Truth", "Truth", "Truth", "Tax", 
    "Tax", "Tax", "Tax", "Tax", "Tax", "Tax", "Tax", "Tax", "Tax", 
    "Tax", "Tax", "Tax", "Tax", "Tax", "Tax"), x_label = c("2_Truth", 
    "3_Truth", "3_Truth", "4_Truth", "4_Truth", "5_Truth", "6_Truth", 
    "6_Truth", "7_Truth", "7_Truth", "8_Truth", "8_Truth", "9_Truth", 
    "9_Truth", "1_Tax", "1_Tax", "2_Tax", "3_Tax", "4_Tax", "4_Tax", 
    "5_Tax", "5_Tax", "6_Tax", "6_Tax", "7_Tax", "7_Tax", "8_Tax", 
    "8_Tax", "9_Tax", "9_Tax")), row.names = c(NA, -30L), class = c("data.table", 
"data.frame"))

   value penalty count  type x_label
 1:     2       0     1 Truth 2_Truth
 2:     3       0     1 Truth 3_Truth
 3:     3       1     2 Truth 3_Truth
...
28:     8       1    10   Tax   8_Tax
29:     9       0    50   Tax   9_Tax
30:     9       1    13   Tax   9_Tax

I tried to make a "bar plot with stack and dodge", as suggested by Kent Johnson in this link, as follows:

ggplot(DT, aes(x=x_label, y=count, fill=penalty)) +
  geom_bar(stat='identity') + labs(x='Value / Treatment') + 
  theme(legend.title = element_blank(), legend.position = c(0.1, 0.85))

enter image description here

However, I would like the chart to look more like the one below (which is how I started out), where at least the bars with the same value sit next to each other but have different colours. If possible, I would also like to keep the density curve, the legend and the x-axis values.

Is there any way to do this?

enter image description here

EDIT: Suggestion of brianavery

enter image description here

Tom
  • some clarification of what variables you want to plot would be helpful, it seems like there is one less variable on the plot that you want to replicate? – brian avery Jan 29 '21 at 15:18
  • @brianavery What I want to plot is already there; it is the first picture I added. The only thing I want to change is the way it looks. The way I want it to look is the second picture (of course while keeping the "stack"). – Tom Jan 29 '21 at 15:20
  • if you take the answer from @tjebo but use `fill = as.factor(penalty)` for the fill to make it a factor and therefore discrete does that help with the color scale problem? I do think it will be very hard to get a density on there too with an approach using summarized data. – brian avery Jan 29 '21 at 15:30
  • I put your suggestion under EDIT. It only seems to change the colours. – Tom Jan 29 '21 at 15:33

3 Answers


You can try:

DT %>%
  mutate(value =as.character(value)) %>% 
  complete(crossing(value,type, penalty), fill = list(count = NA)) %>% 
  ggplot(aes(x= value, y=count, fill =  type)) +
  geom_col(data = . %>% filter(penalty==0), position = position_dodge(width = 0.9), alpha = 0.2) + 
  geom_col(data = . %>% filter(penalty==1), position = position_dodge(width = 0.9), alpha = 1)  +
  geom_tile(aes(y=NA_integer_, alpha = factor(penalty)))

enter image description here

Roman
  • This is great! Thank you so much! – Tom Jan 29 '21 at 15:40
  • Okay, I see why you chose factor 0 to be transparent.. I tried to change it, but then you cannot see the other colour any more. I have a feeling there is little to be done about that, or is there? The other way around would be exactly what I need. – Tom Jan 29 '21 at 15:42
  • Interesting approach! Would you care to add this to the main related thread as well for better visibility? https://stackoverflow.com/questions/12715635/ggplot2-bar-plot-with-both-stack-and-dodge – tjebo Jan 29 '21 at 16:28
  • @tjebo thanks for the suggestion. I will do it in the next few days. – Roman Feb 02 '21 at 18:58
  • Please let me know - so I won't forget to upvote :) – tjebo Feb 02 '21 at 20:53
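On the question of swapping which penalty level is transparent: one possible workaround, sketched here under assumptions and not taken from the answer above, is to stack the two penalty levels instead of overlaying them, so the opaque bar can never hide the transparent one. The data frame `df` is a small stand-in for `DT` (only values 8 and 9), and the alpha values 1 and 0.35 are arbitrary choices.

```r
library(ggplot2)

# Stand-in for DT from the question: same columns, only values 8 and 9
df <- data.frame(
  value   = c(8, 8, 9, 9, 8, 8, 9, 9),
  penalty = c(0, 1, 0, 1, 0, 1, 0, 1),
  count   = c(48, 13, 62, 16, 39, 10, 50, 13),
  type    = rep(c("Truth", "Tax"), each = 4)
)

# Colour encodes type, transparency encodes penalty.
# geom_col() stacks by default, so both penalty levels stay visible
# whichever one is made opaque.
p_alpha <- ggplot(df, aes(x = type, y = count, fill = type,
                          alpha = factor(penalty))) +
  geom_col() +
  # One panel per value places the Truth and Tax bars side by side
  facet_wrap(~ value, nrow = 1, strip.position = "bottom") +
  # Assumed alpha values: penalty 0 opaque, penalty 1 transparent
  scale_alpha_manual(values = c("0" = 1, "1" = 0.35), name = "penalty") +
  labs(x = "Value / Treatment")

p_alpha
```

Swapping the two numbers in `scale_alpha_manual()` reverses which level is transparent.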

The only way I see to get both stack and dodge with the bars close together is to use facets and fake a continuous graph. Apologies, I am still without a real R console and am using rdrr.io/snippets right now.

library(tidyverse)

# DT as defined in the question

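DT %>%
  ggplot(aes(x = as.character(value), y = count, fill = penalty, group = type)) +
  # dodge by type within each value
  geom_col(position = "dodge") +
  labs(x = 'Value / Treatment') +
  # one panel per value; the argument is spelled `scales`
  facet_wrap(~ value, scales = "free_x", nrow = 1) +
  # no gap between panels, so the facets read as one continuous axis
  theme(legend.title = element_blank(), legend.position = c(0.1, 0.85),
        panel.spacing = unit(0, "inch"))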

This approach might make it tricky to produce a density curve. The latest version of ggplot2 allows "cropped" density curves, which might also imitate continuity, but I cannot try this out at the moment.

tjebo
  • Thank you so much! Is there anything to be done about the colours? I tried adding `scale_fill_brewer(palette = "Set1") + ` but it says: `Error: Continuous value supplied to discrete scale` – Tom Jan 29 '21 at 14:55
  • @Tom of course. factorise/categorise penalty - e.g. as.character(penalty) – tjebo Jan 29 '21 at 15:35
  • It does change the colours now haha. But it does not give them a different colour (the bars that are next to each other always have the same colour). I would need four colours instead of two. – Tom Jan 29 '21 at 15:38
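For the four-colour request, a minimal sketch (an assumption on my part, not something posted in the thread) is to fill by the combination of `type` and `penalty`, which gives one discrete level per combination. It also stacks inside each facet, so the penalty levels sit on top of each other instead of overlapping. `df` is a small stand-in for `DT`, and `group4` and the "Paired" palette are illustrative choices.

```r
library(ggplot2)

# Stand-in for DT from the question: same columns, only values 8 and 9
df <- data.frame(
  value   = c(8, 8, 9, 9, 8, 8, 9, 9),
  penalty = c(0, 1, 0, 1, 0, 1, 0, 1),
  count   = c(48, 13, 62, 16, 39, 10, 50, 13),
  type    = rep(c("Truth", "Tax"), each = 4)
)

# One discrete level per type/penalty combination gives four fill colours
df$group4 <- interaction(df$type, df$penalty, sep = ", penalty = ")

p_four <- ggplot(df, aes(x = type, y = count, fill = group4)) +
  geom_col() +  # default position stacks penalty 0 and 1 within each type
  # Facets by value play the role of "dodge": Truth and Tax side by side
  facet_wrap(~ value, nrow = 1, strip.position = "bottom") +
  scale_fill_brewer(palette = "Paired") +
  labs(x = "Value / Treatment", fill = NULL) +
  theme(panel.spacing = unit(0, "lines"))

p_four
```

Because the fill variable is now a factor, discrete scales such as `scale_fill_brewer()` no longer raise the "Continuous value supplied to discrete scale" error.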

It seems easier to deal with the bar chart part the way you have the data structured (as a summary), something like this:

ggplot(DT) +
  geom_col(aes(x=x_label, y=count, fill=as.factor(penalty)), position="dodge") + 
  labs(x='Value / Treatment') + 
  theme(legend.title = element_blank(), legend.position = c(0.1, 0.85))

enter image description here

Making the variable you want to fill by a factor will make the colour scale discrete, which I think is the behaviour you are after.

However, I think the density will be difficult without the original data to plot (like a histogram).

Can you use the original data to make a histogram and density, or provide an example of what it looks like?
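If the original observations are not available, one possible approximation (a sketch under assumptions, not a substitute for the raw data) is to expand the summary back to one row per observation with `tidyr::uncount()` and draw the histogram and density from that. `df` is a stand-in holding the question's counts for values 7 to 9, pooled over `penalty`. The bandwidth `bw = 0.5` is an arbitrary choice, and a kernel density over integer scores is only a rough smooth.

```r
library(ggplot2)
library(tidyr)

# Stand-in summary: counts for values 7 to 9, pooled over penalty
df <- data.frame(
  value = c(7, 8, 9, 7, 8, 9),
  count = c(40L, 61L, 78L, 40L, 49L, 63L),
  type  = rep(c("Truth", "Tax"), each = 3)
)

# uncount() repeats each row `count` times: one row per observation
raw <- uncount(df, weights = count)

p_dens <- ggplot(raw, aes(x = value, fill = type, colour = type)) +
  # Histogram on the density scale so it is comparable with the curves
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 position = "dodge", alpha = 0.5) +
  # Assumed bandwidth; tune it or drop `bw` to use the default
  geom_density(fill = NA, bw = 0.5) +
  labs(x = "Value", y = "Density")

p_dens
```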

brian avery