
When I use a data frame column inside an ifelse() call in the aes() of stat_bin, ggplot2 throws an "object not found" error even though the data frame and the column exist. Here is a reproducible example:

ggplot(mpg, aes(x = displ, fill = trans, label = trans)) +
  geom_histogram(binwidth = 1,col="black") +
  stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=ifelse(..count..>4, as.character(trans), "")))

The above code throws the following error:

*Error in ifelse(count > 4, as.character(trans), "") : object 'trans' not found*

I also tried the following, with no luck:

ggplot(mpg, aes(x = displ, fill = trans, label = trans)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=ifelse(..count..>4, mpg$trans, "")))

I get the following error:

 *Error in ifelse(count > 4, mpg$trans, "") : object 'mpg' not found*

When I take out the ifelse() and try the following, it works fine (it finds both the data frame and the column name):

ggplot(mpg, aes(x = displ, fill = trans, label = trans)) +
geom_histogram(binwidth = 1,col="black") +
stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=as.character(trans)))

What am I missing here?

Gerg
  • It has something to do with your data.frame. `dput()` it here – Martin Schmelzer Nov 22 '17 at 15:12
  • @MartinSchmelzer Thanks for looking at it. I didn't quite get what you said; mpg is a dataset that is already available by default. – Gerg Nov 22 '17 at 15:29
  • Why not just ```mutate``` a new column with the results of ```ifelse``` and then use that as your label? This is cleaner anyway (separation of data manipulation from data visualization). – rsmith54 Nov 22 '17 at 15:29
  • @rsmith54 I am quite new to this, can you please show me how to do that in this context? – Gerg Nov 22 '17 at 15:34
  • I realized I'm not quite sure what your desired output is, but something like ```mpg %>% mutate(label_column = ifelse( trans > 4, trans, "")) %>% ggplot() + geom_histogram( aes(x = displ, fill = trans, label = trans), binwidth = 1,col="black") + stat_bin( aes(x = displ, fill = trans, label = label_column))``` ? – rsmith54 Nov 22 '17 at 15:41
  • Thanks @rsmith54. Here "..count..>4" gives the frequency, and that is required for the labels. To see the desired output: if you replace the second "as.character(trans)" with "..count..", you will see the graph with the frequencies as labels, but what I want is to have "trans" as the label. – Gerg Nov 22 '17 at 15:56
  • There is no `trans` variable in the summary dataset that ggplot2 creates "under the hood". While you can pull out a variety of variables from that dataset (like `..group..` or `..count..`), I don't see a way to pull out a variable about the labels while making labels. Summarizing the dataset prior to plotting is probably your best bet, as shown in the [answer to your last question](https://stackoverflow.com/a/47421642/2461552). – aosmith Nov 22 '17 at 15:57
  • Thanks @aosmith. But why is trans available when I take out the ifelse()? – Gerg Nov 22 '17 at 16:05
  • I don't know. Maybe referring to `..count..` triggers ggplot2 to go looking in the `ggplot_build` dataset instead of the `mpg` dataset? When using `..count..` you can use other created variables in the `ifelse` like `..group..` or `PANEL` but not any variables from the original plotting dataset. – aosmith Nov 22 '17 at 16:24
  • I agree with you, what you said is a possibility. But do you think the fact that referring to ..count.. takes all the variables of the original plotting dataset out of scope is by design, or is it a flaw? – Gerg Nov 22 '17 at 16:31
  • The problem is that the `ifelse` statement needs to be evaluated on *one* data frame. There are two potential data frames to choose from: the original input (`mpg`, which contains `trans`) and the one calculated by `stat_bin` (which contains `count`). But there is no data frame that contains both pieces of information. I think this cannot be solved at present from within ggplot2. The only solution is to pre-calculate the bin counts. – Claus Wilke Nov 22 '17 at 20:28
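The point of the last comment can be illustrated without ggplot2 at all. The following is a minimal base-R sketch, assuming a small made-up data frame `toy` in place of `mpg` and a lower threshold of 3 (both are hypothetical and chosen only to keep the example short). It counts the observations per bin for each `trans` value with `hist(..., plot = FALSE)`, so that `count` and `trans` end up in one data frame where `ifelse()` can see both.

```r
# Hypothetical toy data standing in for mpg: one numeric column, one grouping column.
toy <- data.frame(
  displ = c(1.8, 2.0, 2.2, 2.4, 3.1, 3.3, 1.9, 2.1, 3.0),
  trans = c("auto", "auto", "auto", "auto", "auto", "auto",
            "manual", "manual", "manual"),
  stringsAsFactors = FALSE
)

breaks <- 1:4 + 0.5  # bin edges: 1.5, 2.5, 3.5, 4.5
mids   <- 2:4        # bin centres, one per bin

# Count the observations per bin separately for each trans value.
binned <- do.call(rbind, lapply(split(toy, toy$trans), function(d) {
  data.frame(
    trans = d$trans[1],
    displ = mids,
    count = hist(d$displ, breaks = breaks, plot = FALSE)$counts,
    stringsAsFactors = FALSE
  )
}))
rownames(binned) <- NULL

# count and trans are now columns of the same data frame,
# so a single ifelse() call can refer to both.
binned$label <- ifelse(binned$count > 3, binned$trans, "")
print(binned)
```

A data frame like `binned` can then be passed to `ggplot()` with `geom_col()` and `geom_text()`, which is the approach the answer takes with the real `mpg` data.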

1 Answer


Because of current limitations in ggplot2 (see also my comment to your question), you'll have to pre-calculate the histogram to achieve what you want to do. Like so:

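library(dplyr)    # group_by(), do(), %>%
library(tidyr)    # unnest()
library(ggplot2)  # mpg dataset and plotting

# Bin edges and centres matching binwidth = 1 in the question
breaks <- 1:7 + .5
mids <- 2:7
mpg %>% group_by(trans) %>%
  # histogram counts per transmission type, one row per bin
  do(hist = data.frame(count = hist(.$displ, breaks = breaks, plot = FALSE)$counts,
                       displ = mids)) %>%
  unnest(hist) %>%
  ggplot(aes(x = displ, y = count, fill = trans)) +
  geom_col(position = position_stack(vjust = 0.5), width = 1, color = "black") +
  # count and trans are now columns of the same data frame
  geom_text(aes(label = ifelse(count > 4, trans, "")), position = position_stack(vjust = 0.5))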


Claus Wilke