I'm trying to plot several histograms with the same x-axis limits. To do this, I extracted the x limits from the histogram with the widest x-axis range and am forcing every plot to use them. However, when I set these limits on the plot of the very data they were extracted from, 4 data points are removed. They are clearly within the axis limits; in fact, one is nearly right in the middle. Any idea what is causing this behavior? The code below hopefully makes clear what I'm attempting.

original <- ggplot(myData, aes(x = Signal, fill = Positivity)) +
  geom_histogram(alpha = 0.2, position = "identity", color = "black") +
  scale_x_log10() + theme_bw() + xlab("Original Limits")

original_info <- ggplot_build(original) # get plot info
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
xlimlow <- 10^original_info$layout$panel_scales_x[[1]]$range$range[1] # get lower x axis limit, convert because log10
xlimhigh <- 10^original_info$layout$panel_scales_x[[1]]$range$range[2] # get upper x axis limit, convert because log10

hist_custom_limits <- ggplot(myData, aes(x = Signal, fill = Positivity)) +
  geom_histogram(alpha = 0.2, position = "identity", color = "black") +
  scale_x_log10(limits = c(xlimlow, xlimhigh)) + theme_bw() + xlab("Custom Limits")

The corresponding graphs are as follows:

[Figures not preserved: the "Original Limits" histogram and a second image]

Here is the reprex output, with code that should reproduce the issue:

rm(list = ls())
library(ggplot2)

myData <- data.frame(
                 Signal = c(258L, 290L, 470L, 167L, 133L, 183L, 2441L, 225L, 64L, 140L,
                            204L, 398L, 113L, 269L, 838L, 183L, 182L, 440L,
                            107L, 161L, 215L, 408L, 225L, 1920L, 1579L, 150L, 161L,
                            247L, 129L, 537L, 333L, 193L, 161L, 151L, 97L, 730L,
                            258L, 2234L, 129L, 226L, 86L, 343L, 107L, 183L, 226L,
                            236L, 1029L, 7308L, 376L, 140L, 516L, 269L, 204L,
                            483L, 140L, 440L, 333L),
             Positivity = c(0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L,
                            1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L,
                            0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L,
                            1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L,
                            1L, 0L, 0L)
          )
myData$Positivity <- factor(myData$Positivity)
levels(myData$Positivity) <- c("Negative", "Positive")


original <- ggplot(myData, aes(x = Signal, fill = Positivity)) +
  geom_histogram(alpha = 0.2, position = "identity", color = "black") +
  scale_x_log10() + theme_bw() + xlab("Original Limits")

original_info <- ggplot_build(original) # get plot info
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
xlimlow <- 10^original_info$layout$panel_scales_x[[1]]$range$range[1] # get lower x axis limit, convert because log10
xlimhigh <- 10^original_info$layout$panel_scales_x[[1]]$range$range[2] # get upper x axis limit, convert because log10

hist_custom_limits <- ggplot(myData, aes(x = Signal, fill = Positivity)) +
  geom_histogram(alpha = 0.2, position = "identity", color = "black") +
  scale_x_log10(limits = c(xlimlow, xlimhigh)) + theme_bw() + xlab("Custom Limits")

Created on 2019-03-12 by the reprex package (v0.2.1)

  • Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Mar 12 '19 at 05:09
  • Have you seen the answer to [this question](https://stackoverflow.com/q/49204576/8449629)? – Z.Lin Mar 12 '19 at 05:14
  • Hi Z.Lin, I can use `oob = scales::squish` to force the data to fit, but I don't understand why that's necessary. In the linked question you provided, the behavior is attributed to rounding error. But even if I set the x limits to 10^(-5) and 10^(5), I get 4 dropped values unless I use `scales::squish`. I don't understand why that would be necessary with limits that clearly encompass all of my data. It also produces one bar that is physically squished. It's a workaround; I just don't get why it's needed. – user2017023 Mar 12 '19 at 05:24
  • There may be some margin between the limits for the plot and the limits of the histogram. From what I see, the edge bins of the histogram have open limits. So the data is not so much lost as it is misplaced. Try adding a little buffer to your limits. It should solve the issue – Rohit Mar 12 '19 at 07:11
  • @Tung, I have made the requested changes. Bottom code output should provide reproducible example. – user2017023 Mar 12 '19 at 07:28
  • @rohit, even dividing the x minimum by 100 and multiplying the x maximum by 10 (i.e. adding a huge buffer) still results in 4 data points being excluded, even though the limits at that point are nowhere close to any data. – user2017023 Mar 12 '19 at 07:50
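
The `oob = scales::squish` workaround mentioned in the comments can be sketched as follows. This is a minimal, self-contained sketch rather than a diagnosis: `demo` and `lims` are made-up names with illustrative values (not `myData`), and `bins = 30` is set explicitly only to silence the `stat_bin()` message. It builds the same histogram twice, once with the default out-of-bounds handling and once with squishing, and uses `layer_data()` to inspect the binned bars. Bars whose `xmin` or `xmax` comes back as `NA` are plausible candidates for the rows ggplot2 reports as removed, though that is an assumption and may depend on the ggplot2 version.

```r
library(ggplot2)

# Illustrative data only (hypothetical, not the asker's myData)
demo <- data.frame(
  Signal = c(64, 97, 133, 183, 258, 470, 838, 1579, 2441, 7308),
  Positivity = factor(rep(c("Negative", "Positive"), times = 5))
)

# Assumption for this sketch: the shared limits are simply the data range
lims <- range(demo$Signal)

# Default out-of-bounds handling: values outside the limits become NA
censored <- ggplot(demo, aes(x = Signal, fill = Positivity)) +
  geom_histogram(alpha = 0.2, position = "identity", color = "black", bins = 30) +
  scale_x_log10(limits = lims)

# Squishing: values outside the limits are moved onto the nearest limit
squished <- ggplot(demo, aes(x = Signal, fill = Positivity)) +
  geom_histogram(alpha = 0.2, position = "identity", color = "black", bins = 30) +
  scale_x_log10(limits = lims, oob = scales::squish)

# layer_data() returns the binned rectangles, one row per bar and group
bars_censored <- layer_data(censored)
bars_squished <- layer_data(squished)

# Bars with a missing edge under the default handling, if there are any
edge_missing <- is.na(bars_censored$xmin) | is.na(bars_censored$xmax)
print(bars_censored[edge_missing, c("count", "xmin", "xmax", "group")])

# With squishing, every bar keeps both edges and every observation is counted
print(sum(is.na(bars_squished$xmin) | is.na(bars_squished$xmax)))
print(sum(bars_squished$count))
```

If the first table lists edge bars rather than individual observations, the warning would be counting bars (per group), not data points, which could explain why widening the limits does not change the number reported.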

0 Answers