
Has anyone seen a "Removed XXXX rows containing missing values" warning in which the reported count is lower than expected? I have more points outside the axis range than the warning reports.

I have a large dataset (63,000 rows) with more than 7,000 values below 24, which is my lower y-axis limit, but the warning says only ~3,000 rows were removed.

I haven't found any information about this anywhere, so I would appreciate it if anyone who has run into this before could tell me why it happens. Thanks!

Here is a simpler example, provided by @JonSpring in a comment:

    library(ggplot2)

    # Five points; the middle one (20) falls below the lower y limit of 25
    ggplot(data.frame(X2 = 1:5, X3 = c(30, 31, 20, 40, 41)), aes(X2, X3)) +
      geom_point() +
      geom_line() +
      scale_y_continuous(limits = c(25, 50))
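
As a rough sanity check, the expected counts can be worked out without plotting anything. The sketch below uses base R only and rests on an assumption that nobody in this thread has confirmed: that a line layer drops (and warns about) out-of-range values only at the start or end of a line, while out-of-range values in the middle just break the line. `count_edge_missing()` is a hypothetical helper written for this illustration, not a ggplot2 function, and it ignores grouping.

```r
# Toy data from the example; `lims` mirrors scale_y_continuous(limits = c(25, 50)).
y <- c(30, 31, 20, 40, 41)
lims <- c(25, 50)

# Values outside the limits are the ones the scale would treat as missing.
out_of_range <- y < lims[1] | y > lims[2]

# What a point layer would be expected to report: every out-of-range row.
n_point <- sum(out_of_range)

# Hypothetical helper (NOT part of ggplot2): count only the out-of-range
# values at the start or end of the series, ASSUMING interior gaps are kept
# as line breaks and are not counted in the warning.
count_edge_missing <- function(bad) {
  if (all(bad)) return(length(bad))   # nothing in range: everything is dropped
  first_ok <- which(!bad)[1]          # first in-range position
  last_ok <- max(which(!bad))         # last in-range position
  (first_ok - 1) + (length(bad) - last_ok)
}

n_line <- count_edge_missing(out_of_range)

print(c(point = n_point, line = n_line))
```

For the toy data this gives 1 for the point count and 0 for the line count, which matches the single `geom_point()` warning reported in the comments. Running the same comparison on the real data would show whether the gap between ~7,000 and ~3,000 is explained by interior values.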
  • Welcome to SO, user22483480! I understand completely that you cannot share the actual data, though that does not release you from the expectation of providing something. Please try to create a representative dataset or use a public dataset and find limits/options that evidence the same warning you're seeing there. – r2evans Sep 01 '23 at 12:37
  • For some discussions on how to provide sample data (e.g., `dput`, `data.frame`, `read.table`), please see https://stackoverflow.com/q/5963269 , [mcve], and https://stackoverflow.com/tags/r/info. Thank you! – r2evans Sep 01 '23 at 12:52
  • Hi! Sorry, you are absolutely right. I think I managed to provide an example. Hope this is enough. – user22483480 Sep 01 '23 at 13:53
  • Sorry, I was new and this got closed fast. As an update, I think it may have something to do with geom_line() still plotting part of the data. When I use geom_point() instead, I get the right numbers. If anyone could still comment, I'd be grateful. – user22483480 Sep 01 '23 at 14:19
  • That's a good example, looking at it @user22483480 – r2evans Sep 01 '23 at 14:20
  • @Mark, please consider reopening with the recent edit (and I can reproduce the issue). – r2evans Sep 01 '23 at 14:24
  • Possibly coincidence, but `summary(d$X4 < 24 & lead(d$X4) < 24 & lag(d$X4) < 24)` is also 2, so I wonder if the reporting for `geom_line` observations excluded is based on looking at whether the point and the adjacent points on either side are also within range. – Jon Spring Sep 01 '23 at 16:48
  • Smaller example confirming that the filtering warnings for `geom_line` are not straightforward. Your example doesn't need theming, secondary axes, x limits, extraneous columns and layers, or more data points than we can grok at a glance. Here I have five points, middle one out of range, so two of the four lines not plotted -- with no warning of any missing. `ggplot(data.frame(X2 = 1:5, X3 = c(30, 31, 20, 40, 41)), aes(X2, X3)) + geom_point() + geom_line() + scale_y_continuous(limits = c(25, 50))`. This produces `Warning message: Removed 1 rows containing missing values ('geom_point()').` – Jon Spring Sep 01 '23 at 17:05
  • I think this is an area for potential improvement, so I have entered an issue for ggplot2 here: https://github.com/tidyverse/ggplot2/issues/5405 – Jon Spring Sep 01 '23 at 17:30
  • Yes, you are right - thanks! Sorry for overdoing the example, at first I had no idea where the problem was and secondary axis variable was the one giving me trouble, thought it could be the transformation. Will make it simpler next time :) Thanks again! – user22483480 Sep 01 '23 at 18:33
  • @r2evans thanks for that! :-) also thanks OP for adding code + data to your question – Mark Sep 01 '23 at 19:51
  • The best way to say "thank you" would be to go back and simplify your example to the one proposed by @JonSpring: then the question would be simple and self-contained, and the answer could just point out that it's complicated. – Ben Bolker Sep 01 '23 at 20:22
  • Maybe `geom_line` only warns if *both* endpoints of a segment are missing? – Ben Bolker Sep 01 '23 at 20:24
  • I thought so, but that's not it. (Or at least that's not all of it -- I tried adding more out-of-range points in the middle of my example and it didn't trigger a warning.) – Jon Spring Sep 01 '23 at 21:27

0 Answers