I have written the function below to plot a dodged bar chart with ggplot2:

frq_dodge2 <- function(chart_data) {

  sapphire<-c("#00A8C8","#006D9E","#002C77","#A6E2EF","#51d5ee","#1d5cc7")
  g<-ggplot(chart_data, aes(x=X, y=value,fill=Q))

  chart <-  g+ 
    geom_bar(position = position_dodge2(preserve = "single",width=0.9),stat='identity') +
    scale_fill_manual(values = sapphire)+
    labs(x= NULL, y= NULL, subtitle=NULL) +       
    ylab(NULL) +
    geom_text(chart_data = subset(chart_data,value!=0),aes(label=paste0(value,"%")),
              position=position_dodge2(width=0.9), vjust=-0.25,
              size=3,fontface="bold", colour="#404040") +
    labs(x=NULL, y=NULL)+
    scale_y_continuous( labels = number_format(suffix = "%"),
                        limits = c(min(0,min(chart_data$value)+min(chart_data$value)),
                                   max(0,max(chart_data$value) + max(chart_data$value) / 10)))+
    scale_x_discrete(labels = function(x) str_wrap(x, width = 10),limits=unique(chart_data$Stats))    
  chart
}

The issue is that when one of the series is completely missing from the data, the bars are too wide and do not look good. For example, with the data below the bars are plotted too wide.

> dput(expat)
structure(list(X = structure(c(1L, 1L), .Label = c("Less than 50", 
"50-100", "100-250", "250-500", "500-1000", "1000-3000", "3000-5000", 
"more than 5000"), class = "factor"), Q = structure(1:2, .Label = c("2018 (Actual)", 
"2019 (Forecast)"), class = "factor"), value = c(100, 100)), class = "data.frame", row.names = c(NA, 
-2L))

Calling frq_dodge2(expat) produces a chart with overly wide bars,

whereas with other data, where the other series is not completely missing, the plot is fine:

> dput(localplus)
structure(list(X = structure(c(6L, 1L, 6L, 2L, 1L), .Label = c("Less than 50", 
"50-100", "100-250", "250-500", "500-1000", "1000-3000", "3000-5000", 
"more than 5000"), class = "factor"), Q = structure(c(1L, 1L, 
2L, 2L, 2L), .Label = c("2018 (Actual)", "2019 (Forecast)"), class = "factor"), 
    value = c(14, 86, 11, 22, 67)), class = "data.frame", row.names = c(NA, 
-5L))

I used preserve = "single" to keep the bar width fixed when data is missing in some series, but this does not help when the other series is completely missing from the data (as in expat).

Is there any way to fix this?

  • `sapphire` is missing from what you provided. I had to comment out that line to run your code because of this. – RoB Dec 04 '19 at 08:36
  • Sorry, I missed adding sapphire to the code; I have edited it. Thanks for your answer, it works fine this way, but is there any way to hide the x axis labels that have a 0 value or no value? – Sandhya Ghildiyal Dec 04 '19 at 09:52
  • If you want to remove the labels of missing variables, you could set the `scale_x_discrete` labels to `""` for those. It should be fairly straightforward. – RoB Dec 04 '19 at 10:01
  • See my updated answer for the label removals. – RoB Dec 04 '19 at 10:14

1 Answer

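As mentioned in this answer, you need to set drop = FALSE in your x scale call, so that unused factor levels keep their slot on the axis.
In your function, that's the last line (note that the limits argument has been removed):

scale_x_discrete(labels = function(x) str_wrap(x, width = 10), drop = FALSE)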

For me, this yields the following:

[figure: resulting plot]
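
To see the effect of drop = FALSE in isolation, here is a minimal sketch. It assumes toy data shaped like `expat` from the question; the names `size_levels`, `toy`, `base`, `p_dropped` and `p_kept` are made up for illustration and are not part of the original function.

```r
library(ggplot2)

# Hypothetical toy data mirroring `expat`: eight factor levels, only one in use.
size_levels <- c("Less than 50", "50-100", "100-250", "250-500",
                 "500-1000", "1000-3000", "3000-5000", "more than 5000")
toy <- data.frame(
  X     = factor(rep("Less than 50", 2), levels = size_levels),
  Q     = factor(c("2018 (Actual)", "2019 (Forecast)")),
  value = c(100, 100)
)

base <- ggplot(toy, aes(x = X, y = value, fill = Q)) +
  geom_col(position = position_dodge2(preserve = "single", width = 0.9))

# Default (drop = TRUE): unused levels are dropped, so the two bars
# share the whole panel and look too wide.
p_dropped <- base + scale_x_discrete()

# drop = FALSE: every factor level keeps its slot on the axis,
# so the two bars only occupy the first of eight positions.
p_kept <- base + scale_x_discrete(drop = FALSE)
```

Printing `p_dropped` and `p_kept` side by side should make the difference in bar width visible.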

EDIT: remove unneeded labels on the x axis

Just check which levels are missing from the data and change their labels to "". The full function thus becomes:

library(ggplot2)
library(scales)  # for number_format()

frq_dodge2 <- function(chart_data) {

  sapphire <- c("#00A8C8", "#006D9E", "#002C77", "#A6E2EF", "#51d5ee", "#1d5cc7")
  g <- ggplot(chart_data, aes(x = X, y = value, fill = Q))

  # Blank out the labels of factor levels that do not occur in the data
  lvs <- levels(chart_data$X)
  miss_lvs <- which(!lvs %in% unique(chart_data$X))
  lvl_labs <- lvs
  lvl_labs[miss_lvs] <- ""

  chart <- g +
    geom_bar(position = position_dodge2(preserve = "single", width = 0.9), stat = "identity") +
    scale_fill_manual(values = sapphire) +
    labs(x = NULL, y = NULL, subtitle = NULL) +
    # Label only the non-zero bars
    geom_text(data = subset(chart_data, value != 0), aes(label = paste0(value, "%")),
              position = position_dodge2(width = 0.9), vjust = -0.25,
              size = 3, fontface = "bold", colour = "#404040") +
    scale_y_continuous(labels = number_format(suffix = "%"),
                       limits = c(min(0, min(chart_data$value) + min(chart_data$value)),
                                  max(0, max(chart_data$value) + max(chart_data$value) / 10))) +
    # drop = FALSE keeps a slot for every factor level, present or not
    scale_x_discrete(labels = lvl_labs, drop = FALSE)
  chart
}

and yields

[figure: resulting plot with blank labels for the missing levels]
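
The label-blanking step can also be tried on its own, without any plotting. This is a small sketch in base R, assuming a factor shaped like `expat$X`; `size_levels`, `x` and `present` are illustrative names only.

```r
# Hypothetical factor mirroring expat$X: eight levels, only the first one used.
size_levels <- c("Less than 50", "50-100", "100-250", "250-500",
                 "500-1000", "1000-3000", "3000-5000", "more than 5000")
x <- factor(c("Less than 50", "Less than 50"), levels = size_levels)

lvs <- levels(x)

# TRUE for levels that actually occur in the data
present <- lvs %in% as.character(unique(x))

# Keep the label for present levels, use "" for the missing ones
lvl_labs <- ifelse(present, lvs, "")

print(lvl_labs)
```

The resulting vector has one entry per level, in level order, which is what `scale_x_discrete(labels = ..., drop = FALSE)` needs so that each label lines up with its slot.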

As a side note, in the geom_text call you had written `chart_data = subset...` instead of `data = subset...`.

Answered by RoB