I would like to make a bar plot in R in which the last bar represents the sum of all values whose frequency is greater than a certain threshold. I want to show this in the x-axis label that corresponds to the last bar. For instance:

library(ggplot2)

x <- c(1, 2, 3, 4, 5)
y <- c(4000, 3000, 2000, 1000, 500)

df <- data.frame(x, y)

names(df) <- c("Var1", "Freq")

theme_set(theme_classic())

g <- ggplot(df, aes(Var1, Freq))
g + geom_bar(stat = "identity", width = 0.5, fill = 'tomato2') + 
  xlab('Var1') +
  ylab('Freq') +
  theme(axis.text.x = element_text(angle = 0, 
                                   vjust = 0.6, 
                                   colour = "black"),
        axis.text.y = element_text(colour = "black"))

The above code produces a chart similar to this:

[image: the resulting bar chart]

But for the last bar, I want the last value on the x-axis (x = 5) to be displayed as >= 5.

So far, I've tried to use scale_x_discrete. So I added to the above code the following lines:

n <- 5

# I'm not very creative with names.
.foo <- function(x, n) {
  if (x == n) {
    element <- paste('\u2265', toString(x), sep = ' ')
  } else {
    element <- toString(x)
  }
}

labels <- sapply(seq(n), .foo, n)

g + scale_x_discrete(breaks = sapply(seq(n), function(x) toString(x)),
                     labels = labels)

This code formats the x-axis as I wish, but it overrides the bar plot, leaving an empty chart:

[image: empty chart showing only the formatted x-axis]

How can I do this?

rcs
marcelo
  • this might help: https://stackoverflow.com/questions/21646100/how-to-set-expressions-as-axis-text-of-facets-in-ggplot2/21650177#21650177 – user20650 Apr 07 '19 at 21:13

3 Answers

Change the labels in scale_x_continuous:

... + scale_x_continuous(labels=c("0", "1", "2", "3", "4", "\u2265 5"))
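
The one-liner above relies on whatever default breaks ggplot2 picks for the axis. A minimal sketch that pins the breaks explicitly, assuming `Var1` is numeric as in the question (the names `x_breaks`, `x_labels` and `p` are made up for this illustration):

```r
library(ggplot2)

# Same data as in the question; Var1 is numeric here
df <- data.frame(Var1 = c(1, 2, 3, 4, 5),
                 Freq = c(4000, 3000, 2000, 1000, 500))

# One break per bar, so breaks and labels always have the same length
x_breaks <- df$Var1

# Prefix only the largest value with the "greater than or equal" sign
x_labels <- ifelse(x_breaks == max(x_breaks),
                   paste("\u2265", x_breaks),
                   as.character(x_breaks))

p <- ggplot(df, aes(Var1, Freq)) +
  geom_bar(stat = "identity", width = 0.5, fill = "tomato2") +
  scale_x_continuous(breaks = x_breaks, labels = x_labels)

p
```

Setting `breaks` and `labels` together avoids a length mismatch if the default breaks differ between ggplot2 versions.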

rcs
  • When I changed to `scale_x_continuous` I got the following error: `Erro: Discrete value supplied to continuous scale` – marcelo Apr 07 '19 at 21:33
  • Using `scale_x_discrete(labels=c("0", "1", "2", "3", "4", "\u2265 5"))` did not work either. – marcelo Apr 07 '19 at 22:05
  • @marcelo The answer above with `scale_x_continuous` should work, unless your actual `df$Var1` isn't integer / numeric. And are you sure you've assigned the full ggplot object (i.e. `g + geom_bar(...)`) back to `g`? You haven't done that in the code included in the question, & getting a blank plot would be perfectly reasonable in that case. – Z.Lin Apr 08 '19 at 06:45
  • @marcelo this assumes that the column `Var1` is numeric/integer, as above in the code in your question (I used `ggplot2` 3.1.0). – rcs Apr 08 '19 at 07:18
  • @Z.Lin I assigned `geom_bar(...)` to the ggplot object first and then used `scale_x_discrete`. Changing the order of assignment fixed my problem, i.e., first assign `scale_x_discrete` and then `geom_bar(...)`. Sorry, my bad. Thank you all! – marcelo Apr 08 '19 at 15:53
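
The blank-plot point raised in the comments can be checked directly. This is only a sketch, assuming `Var1` is a factor (which is what `scale_x_discrete` expects); `blank`, `full` and `x_labels` are names invented for this example:

```r
library(ggplot2)

# Assumption: Var1 is a factor, so a discrete x scale applies
df <- data.frame(Var1 = factor(1:5),
                 Freq = c(4000, 3000, 2000, 1000, 500))
x_labels <- c("1", "2", "3", "4", "\u2265 5")

# g holds only the data and aesthetics; no geom has been added yet
g <- ggplot(df, aes(Var1, Freq))
blank <- g + scale_x_discrete(labels = x_labels)

# Reassign g so that the bar layer is stored in the object
g <- g + geom_bar(stat = "identity", width = 0.5, fill = "tomato2")
full <- g + scale_x_discrete(labels = x_labels)

# Number of layers stored in each plot object
length(blank$layers)
length(full$layers)
```

`blank` draws only axes because it contains no layers, while `full` carries the bars along with the relabelled axis.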
One approach is to leave the axis tick labels alone and instead convert Var1 to a factor, then lump its low-frequency levels with forcats::fct_lump_min so that the final level is labelled ≥5:

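# Insert after df is generated, before the plot call
library(dplyr)    # provides %>% and mutate()
library(forcats)

df <- df %>%
  mutate(Var1 = as_factor(Var1),
         # levels whose total Freq is below 501 are merged into "≥5"
         Var1 = fct_lump_min(Var1, min = 501, w = Freq, other_level = "≥5"))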
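
For completeness, here is a self-contained sketch of the same idea carried through to the plot, assuming the data from the question. The names `lumped` and `p` are made up for the example, and `geom_col()` is used as shorthand for `geom_bar(stat = "identity")`:

```r
library(ggplot2)
library(dplyr)
library(forcats)

df <- data.frame(Var1 = c(1, 2, 3, 4, 5),
                 Freq = c(4000, 3000, 2000, 1000, 500))

# Merge every level whose weighted count (Freq) is below 501
# into a single level labelled with the "greater or equal" sign
lumped <- df %>%
  mutate(Var1 = as_factor(Var1),
         Var1 = fct_lump_min(Var1, min = 501, w = Freq,
                             other_level = "\u2265 5"))

# The factor levels become the tick labels, so no scale_* call is needed
p <- ggplot(lumped, aes(Var1, Freq)) +
  geom_col(width = 0.5, fill = "tomato2")

p
```

With real data in which several values exceed the threshold, the rows sharing the lumped level would be stacked into one bar by `geom_col()`.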
Brent
The problem, as pointed out in @Z.Lin's comment, was that I was assigning geom_bar(...) to the ggplot object before using scale_x_discrete. Here is the solution:

library(ggplot2)
...
labels <- sapply(seq(n), .foo, n)

g <- ggplot(df, aes(Var1, Freq)) +
  scale_x_discrete(breaks = sapply(seq(n), function(x) toString(x)),
                   labels = labels)

g + geom_bar(stat = "identity", width = 0.5, fill = color) +
...
marcelo