(Posted this earlier and forgot to include a reproducible example.)

I merged two dataframes with a left join to make a fairly large dataframe. I'm now trying to use ggplot2 to graph two columns from the dataframe, but one of them doesn't seem to be graphing correctly. It ends at x = 400, even though it has plenty of y-values that have an x-value beyond 400.

Here's some sample data. This is a sample of a larger dataframe, so the graph will look very strange.

irradiance <- data.frame(
    lambda = c(337, 337.5, 338, 400, 400.5, 401, 401.5, 650, 650.5, 651),
    date = as.Date("2016-07-19"),
    Local_irrad = c(.159, .175, .182, .315, .326, .335, .342, .248, .246, .248),
    Global_horizn_irradiance = c(.4942, .5295, .5682, 1.232, NA, 1.281, NA, 1.249, NA, 1.326))

lambda  date        Local_irrad  Global_horizn_irradiance
337     7/19/2016   0.159        0.4942
337.5   7/19/2016   0.175        0.5295
338     7/19/2016   0.182        0.5682
400     7/19/2016   0.315        1.232
400.5   7/19/2016   0.326        NA
401     7/19/2016   0.335        1.281
401.5   7/19/2016   0.342        NA
650     7/19/2016   0.248        1.249
650.5   7/19/2016   0.246        NA
651     7/19/2016   0.248        1.326

There are plenty of NA values, but also plenty of "true" values. Maybe the NAs are throwing it off somehow? In my graph of the full data (which won't look exactly like a plot of the sample above), Global_horizn_irradiance ends at 400.

Here's my code:

ggplot(irradiance, aes(x=lambda)) +
    geom_line(aes(y=Global_horizn_irradiance), color="red") +
    geom_line(aes(y=Local_irrad), color="blue")
ale19
  • A line is a succession of segments: each segment is defined by two consecutive non-NA values, and an NA defines a break. You don't have two consecutive non-NA values after 400. – Stéphane Laurent Jun 16 '17 at 20:20
  • @StéphaneLaurent is correct. I didn't realize the lines would just stop entirely if there weren't 2 consecutive non-NA values-- I thought it would just skip that particular value and then resume later. Thank you! Stéphane, if you post your comment as an answer, I will accept it. – ale19 Jun 16 '17 at 20:41
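
To see the point from the comments in the sample data, here is a minimal base R sketch that measures the runs of consecutive non-NA values in Global_horizn_irradiance. The helper names `present`, `runs`, and `run_lengths` are made up for this illustration, and only the two relevant columns are recreated.

```r
# Recreate the relevant columns of the sample data from the question
irradiance <- data.frame(
    lambda = c(337, 337.5, 338, 400, 400.5, 401, 401.5, 650, 650.5, 651),
    Global_horizn_irradiance = c(.4942, .5295, .5682, 1.232, NA, 1.281, NA, 1.249, NA, 1.326))

# TRUE where a value is present, FALSE where it is NA
present <- !is.na(irradiance$Global_horizn_irradiance)

# Run-length encode the flags to find stretches of consecutive present values
runs <- rle(present)

# Keep only the lengths of the non-NA runs
run_lengths <- runs$lengths[runs$values]
run_lengths
```

Only the first run (lambda 337 to 400) contains two or more consecutive values, so it is the only stretch where a segment can be drawn. Every later value is isolated between NAs, which matches the line stopping at 400.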

2 Answers

As @StéphaneLaurent commented, geom_line draws a segment only between consecutive non-NA values, so every NA breaks the line. You can drop the rows with NAs for that layer, as follows, to get a continuous line:

ggplot(irradiance, aes(x=lambda)) +
    geom_line(data=subset(irradiance, !is.na(Global_horizn_irradiance)),
              aes(y=Global_horizn_irradiance), color="red") +
    geom_line(aes(y=Local_irrad), color="blue")
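
A possible variation on the same idea, offered only as a sketch: stack the two measurement columns into one long data frame, drop the NA rows once, and map color to the series name so ggplot2 also draws a legend. The names `long`, `series`, and `value` are invented for this example and do not come from the question.

```r
library(ggplot2)

# Sample data from the question (date column omitted for brevity)
irradiance <- data.frame(
    lambda = c(337, 337.5, 338, 400, 400.5, 401, 401.5, 650, 650.5, 651),
    Local_irrad = c(.159, .175, .182, .315, .326, .335, .342, .248, .246, .248),
    Global_horizn_irradiance = c(.4942, .5295, .5682, 1.232, NA, 1.281, NA, 1.249, NA, 1.326))

# One row per (lambda, series) pair instead of one column per series
long <- data.frame(
    lambda = rep(irradiance$lambda, 2),
    series = rep(c("Local_irrad", "Global_horizn_irradiance"), each = nrow(irradiance)),
    value  = c(irradiance$Local_irrad, irradiance$Global_horizn_irradiance))

# Drop missing measurements so each series is drawn as one unbroken line
long <- long[!is.na(long$value), ]

p <- ggplot(long, aes(x = lambda, y = value, color = series)) +
    geom_line()
p
```

Note that this connects the points on either side of each missing value with a straight segment, just like the subset approach, so the plot no longer shows where data was missing.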
Ben Bolker
Mosquite

geom_line does not skip over NA values. It breaks the line at each one, so the red line has gaps wherever Global_horizn_irradiance is missing. If you want every measured value to show up, you might have to use geom_point instead:

> ggplot(irradiance, aes(x=lambda)) + 
+ geom_point(aes(y=Global_horizn_irradiance), color="red") + 
+ geom_point(aes(y=Local_irrad), color="blue")
Warning message:
Removed 3 rows containing missing values (geom_point). #notice that your original call doesn't generate this warning
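
As a quick cross-check of the warning above, this base R sketch counts the missing values in the sample data and lists the wavelengths they fall on. The variable names `is_missing`, `n_missing`, and `missing_lambda` are illustrative only.

```r
# Relevant columns of the sample data from the question
irradiance <- data.frame(
    lambda = c(337, 337.5, 338, 400, 400.5, 401, 401.5, 650, 650.5, 651),
    Global_horizn_irradiance = c(.4942, .5295, .5682, 1.232, NA, 1.281, NA, 1.249, NA, 1.326))

# Flag the rows where the global irradiance is missing
is_missing <- is.na(irradiance$Global_horizn_irradiance)

# Number of rows geom_point has to drop for the red layer
n_missing <- sum(is_missing)

# Wavelengths at which no red point can be drawn
missing_lambda <- irradiance$lambda[is_missing]
```

For the sample data this gives three missing rows, which agrees with the "Removed 3 rows" message. Local_irrad has no NAs, so the blue layer drops nothing.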

Matt