1

I produced this bar plot (see below). To group my countries by region, I added scale_x_discrete(limits = ORDER) with some empty limits "" (specified by ORDER). This adds empty bars to the plot, which works fine for me, but the axis ticks are not consistent: the empty bars get no axis tick (which I prefer), except for the last empty bar, which does get one. Why is that? How can I get rid of this single tick?

ORDER <- c("Kiribati",  "Marshall Islands",  "Palau",  "States of Micronesia",
       "",
       "Micronesia g." ,
       "",
       "Fiji",  "Nauru",  "PNG",  "Solomon Islands",  "Vanuatu",
       "",
       "Melanesia g.",
       "",
       "Cook Islands",  "Niue",  "Samoa",  "Tonga",  "Tuvalu",
       "",
       "Polynesia g."
      )
ORDER

library(ggplot2)

# ESA_coun_p is my data frame (not shown here)
ggplot(ESA_coun_p, aes(x = x, y = y)) +
 geom_col(position = "dodge", na.rm = TRUE) +
 scale_x_discrete(limits = ORDER) +
 coord_flip()

thothal & Romain B. gave some great replies that solve the question, each with its pros and cons.

@thothal: Your suggestion of using labels instead of limits makes the plot consistent, as it adds axis ticks to all empty separation bars. However, it may require hard-coding some empty extra observations and reordering factors. It also does not distinguish the different groups too well from each other.

@Romain B.: Your suggestion works very well and distinguishes the different groups clearly. However, I ran into difficulties with a more sophisticated plot, a "gap bar plot", which makes it easier to compare values when there are outliers (see your example, adjusted, below).

library(dplyr)    # for %>% and mutate()
library(ggplot2)

set.seed(10)
test <- data.frame(country = LETTERS[1:12], 
                   region = c(1,1,1,1,2,2,3,4,4,4,5,5), 
                   value = rnorm(12, mean = 10)) %>%
 mutate(value = replace(value, country == 'A', 100))  # turn country A into an extreme outlier

# I'm ordering by <value> here, so in the plot, they'll be ordered as such 
test$country <- factor(test$country, levels = test$country[order(test$value)])
######
trans_rate_surf <- 0.02    ## play around; defines how strongly values above the cut are compressed
white_space_min_surf <- 20 ## a little bit above the last fully displayed bar
white_space_max_surf <- 80 ## a little bit below the first cropped bar
#####  
trans_surf <- function(x){pmin(x,white_space_min_surf) + trans_rate_surf*pmax(x-white_space_min_surf,0)}
yticks_surf <- c(5, 10, 15, 20,  100) ## not within or too close to the white space
##
test$value_t <- trans_surf(test$value)

ggplot(test, aes(x = country, y = value_t)) + geom_bar(stat = 'identity') + coord_flip()+
 geom_rect(aes(xmin=0, xmax=nrow(test)+0.6, ymin=trans_surf(white_space_min_surf), ymax=trans_surf(white_space_max_surf)), fill="white")+
 scale_y_continuous(limits=c(0,NA), breaks=trans_surf(yticks_surf), labels=yticks_surf)

If I now add + facet_grid(rows = vars(region), scales = "free_y", space = "free_y"), everything is messed up, because xmax = nrow(test) + 0.6 no longer fits; it would need to be region-sensitive.


MsGISRocker

2 Answers

0

You could have a region variable and facet the plot according to it. You can then play with facet plot spacing.

You didn't provide data, so I made a dummy test dataframe.

library(ggplot2)

set.seed(10)
test <- data.frame(country = LETTERS[1:12], 
                   region = c(1,1,1,1,2,2,3,4,4,4,5,5), 
                   value = rnorm(12, mean = 10))

# I'm ordering by <value> here, so in the plot, they'll be ordered as such 
test$country <- factor(test$country, levels = test$country[order(test$value)])

ggplot(test, aes(x = country, y = value)) + geom_bar(stat = 'identity') + 
  facet_grid(rows = vars(region), scales = "free_y", space = "free_y") + coord_flip() +
  theme(panel.spacing = unit(1, "lines")) # play with this to spread more

This yields

enter image description here

While I ordered by value here, you can give the order you want as the levels of your factor.
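As a rough sketch of that last point, the levels can be set from a hand-written order instead of from the values. The data frame `dat`, its values and the vector `country_order` below are made up for illustration and are not the asker's data.

```r
library(ggplot2)

# Hypothetical data: countries, regions and values are invented for this sketch.
dat <- data.frame(
  country = c("Fiji", "Kiribati", "Nauru", "Palau"),
  region  = c("Melanesia", "Micronesia", "Melanesia", "Micronesia"),
  value   = c(3, 1, 4, 2)
)

# Desired reading order from top to bottom (assumed for this example).
country_order <- c("Kiribati", "Palau", "Fiji", "Nauru")

# coord_flip() draws the first factor level at the bottom of each panel,
# so the levels are reversed to read top to bottom.
dat$country <- factor(dat$country, levels = rev(country_order))

# The order of the region levels controls the order of the facet panels.
dat$region <- factor(dat$region, levels = c("Micronesia", "Melanesia"))

p_ordered <- ggplot(dat, aes(x = country, y = value)) +
  geom_col() +
  facet_grid(rows = vars(region), scales = "free_y", space = "free_y") +
  coord_flip()
```

Because the order lives in the factor levels, no `limits` argument is needed, and each panel only shows the countries of its own region.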

EDIT : with "gap"

I will put a disclaimer here: I personally do not think that using plots with axis breaks or gaps is a good idea. This has been extensively discussed on this website before, and there are many ways around it (e.g. transforming your data, using log scales, building indices, etc.).

Since you're set on doing it your way, I'll give you another workaround: use a line with a large width.

trans_rate_surf <- 0.02    ## play around; defines how strongly values above the cut are compressed
white_space_min_surf <- 20 ## a little bit above the last fully displayed bar
white_space_max_surf <- 80 ## a little bit below the first cropped bar
#####  
trans_surf <- function(x){pmin(x,white_space_min_surf) + trans_rate_surf*pmax(x-white_space_min_surf,0)}
yticks_surf <- c(5, 10, 15, 20,  100) ## not within or too close to the white space
##
test$value_t <- trans_surf(test$value)

ggplot(test, aes(x = country, y = value_t)) + geom_bar(stat = 'identity') +
  scale_y_continuous(limits=c(0,NA), breaks=trans_surf(yticks_surf), labels=yticks_surf) +
  facet_grid(rows = vars(region), scales = "free_y", space = "free_y") + coord_flip() +
  theme(panel.spacing = unit(1, "lines")) + # play with this to spread more
  # wide white line acting as the "gap"; ggplot2 >= 3.4.0 prefers linewidth over size
  geom_hline(yintercept = trans_surf(50), size = 10, color = "white")

The last line of the plot is the only thing I've changed from your post's code. As a result, I get:

enter image description here
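If a rectangle is preferred over a wide line, one possible variant (a sketch, not taken from the question or from the answer above) is to let the rectangle span the whole discrete axis with `-Inf`/`Inf`, so that it no longer depends on `nrow(test)` and is repeated in every facet panel. The names below repeat the question's example; `p_gap` is a name introduced here.

```r
library(ggplot2)

# Same toy data as in the question, rebuilt here so the sketch is self-contained.
set.seed(10)
test <- data.frame(country = LETTERS[1:12],
                   region = c(1,1,1,1,2,2,3,4,4,4,5,5),
                   value = rnorm(12, mean = 10))
test$value[test$country == "A"] <- 100  # one extreme outlier
test$country <- factor(test$country, levels = test$country[order(test$value)])

trans_rate_surf <- 0.02
white_space_min_surf <- 20
white_space_max_surf <- 80
trans_surf <- function(x) {
  pmin(x, white_space_min_surf) + trans_rate_surf * pmax(x - white_space_min_surf, 0)
}
yticks_surf <- c(5, 10, 15, 20, 100)
test$value_t <- trans_surf(test$value)

p_gap <- ggplot(test, aes(x = country, y = value_t)) +
  geom_col() +
  # xmin/xmax = -Inf/Inf fill the full category axis of whichever panel is drawn,
  # so no panel-specific xmax is needed.
  annotate("rect", xmin = -Inf, xmax = Inf,
           ymin = trans_surf(white_space_min_surf),
           ymax = trans_surf(white_space_max_surf),
           fill = "white") +
  scale_y_continuous(limits = c(0, NA),
                     breaks = trans_surf(yticks_surf), labels = yticks_surf) +
  facet_grid(rows = vars(region), scales = "free_y", space = "free_y") +
  coord_flip() +
  theme(panel.spacing = unit(1, "lines"))
```

Since `annotate()` carries no faceting variable, the same rectangle is drawn in each panel, which is the region-sensitive behaviour the question asks for.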

RoB
  • I like your solution, but I have difficulties ordering my categorical variables in the plot (the countries). If I use `scale_x_discrete(limits = ORDER)` as before, it shows all country names for each region, even if the country is not part of that region. If I use `scale_x_discrete()`, the country names are ordered by their first appearance in the data frame. – MsGISRocker Dec 03 '19 at 15:29
  • You can always reorder the factor levels themselves in your input dataframe, since ggplot will respect that order. E.g. if you have `f`, a factor with levels A, B and C, you can do `f <- factor(f, levels = c('B', 'C', 'A'))`. – RoB Dec 03 '19 at 15:43
  • @MsGISRocker I've updated my answer with a reordering of the bars – RoB Dec 03 '19 at 15:48
  • I do not understand what you're trying to do with this "white space". If you want the background to be white, you can change the theme of the plot, no? E.g. `+ theme_classic()` – RoB Dec 04 '19 at 11:52
  • I edited my question, as this could not be summarized well in a short comment. – MsGISRocker Dec 04 '19 at 15:25
  • @MsGISRocker I've updated my answer with a workaround. Again, I don't recommend using these broken-axis representations as they can confuse and distort one's view of the data. – RoB Dec 04 '19 at 15:44
  • I know plots with axis breaks or gaps are controversial. While the comparability increases for some bars, it decreases for others. All solutions for handling outliers in bar plots have their pros and cons. In my case, with one extreme outlier, I went for the gap. – MsGISRocker Jan 03 '20 at 14:15
0

You should use labels instead of limits. Toy example below because you did not provide a reprex.

Explanation

With limits you set the, well, limits of the scale. As it is a discrete scale, it expects the unique data points. But your labels are not unique. What you want is to set the labels of the scale, and hence you should use the argument labels.
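A quick way to see the non-uniqueness is to check the vector for duplicates. The shortened vector `ORDER_demo` below is an assumed stand-in for the asker's `ORDER`, kept small for illustration.

```r
# Shortened, hypothetical version of the ORDER vector from the question.
ORDER_demo <- c("Kiribati", "", "Micronesia g.", "", "Fiji")

# anyDuplicated() returns the index of the first repeated element (0 if none).
first_dup <- anyDuplicated(ORDER_demo)

# unique() shows what a discrete scale can actually tell apart:
# the two "" entries collapse into a single value.
distinct_values <- unique(ORDER_demo)
```

Here the second `""` is reported as a duplicate, so the separators cannot serve as distinct positions on the scale.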

Data

library(tidyverse)
set.seed(1)
my_dat <- mtcars %>% 
    rownames_to_column() %>% 
    as_tibble() %>% 
    select(rowname, mpg) %>% 
    add_row(rowname = paste0("remove", 1:3), mpg = rep(0, 3)) %>% 
    slice(sample(NROW(.))) %>% 
    mutate(rowname = factor(rowname, rowname))

p <- ggplot(my_dat, aes(x=rowname, y = mpg)) + 
   geom_col(position = "dodge", na.rm = FALSE) + 
   coord_flip()

rn <- gsub("^remove[0-9]+", "", my_dat$rowname)

Wrong Way using limits

p + scale_x_discrete(limits = rn)

Wrong way via limits

Correct Way using labels

p + scale_x_discrete(labels = rn)

Correct Way Using Labels

thothal
  • I like your solution, but I have difficulties ordering my categorical variables in the plot (the countries). I used `scale_x_discrete(limits = ORDER)` also to control the order in which the country names are displayed in the plot. Using `scale_x_discrete(labels = ORDER)` causes the axis text not to match the bar values. – MsGISRocker Dec 03 '19 at 15:33
  • The order of the x axis is determined by the order of your x input, not by the order of appearance in your data frame. The default is to sort it alphabetically. To change that, transform it to a factor and define the order via the levels (that's what I have done: `factor(rowname, rowname)` tells `R` to use the order of appearance). Then you could take the `levels`, change the unwanted labels to blanks and pass the result to `labels`. – thothal Dec 03 '19 at 16:49
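
The recipe from the last comment could look roughly like the sketch below. The data frame `df`, the `gap` placeholder rows and the name `p_labels` are invented for illustration; the labels are passed as a named vector so that each label is matched to its level by name rather than by position.

```r
library(ggplot2)

# Hypothetical data: "gap1" and "gap2" are placeholder rows used as separators.
df <- data.frame(
  name  = c("A", "B", "gap1", "C", "D", "gap2", "E"),
  value = c(3, 5, 0, 2, 4, 0, 1)
)

# Fix the bar order as the order of appearance in the data frame.
df$name <- factor(df$name, levels = df$name)

# Start from the levels, then blank out the placeholder names.
lab <- levels(df$name)
lab[grepl("^gap[0-9]+$", lab)] <- ""
names(lab) <- levels(df$name)  # name each label after the level it belongs to

p_labels <- ggplot(df, aes(x = name, y = value)) +
  geom_col() +
  scale_x_discrete(labels = lab) +
  coord_flip()
```

The scale keeps one unique level per bar, including the separators, and only the displayed text is blanked, so labels and bars stay aligned.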