When plotting multiple variables at once using ggplot2 in R, how can I omit NA entries so that the line graph is connected?

As you can see my dataset is riddled with NA entries. (I am using my weightlifting training to learn how to use R. As I am not doing the same exercise each workout, I am bound to have NA entries.) I know that I can omit lines with NA entries, but that would delete every line in my dataframe.

Is it possible to plot my dataframe as a continuous line plot without line breaks?

ss <- structure(list(date = structure(c(18421, 18423, 18425, 18428, 
18431, 18435), class = "Date"), bw = c(NA, NA, NA, 95.4, NA, 
NA), squat = c(40, 42.5, 45, 47.5, 50, 52.5), deadlift = c(60, 
NA, 62.5, 65, 67.5, 70), press = c(25, NA, 27.5, NA, 30, NA), 
    bench = c(NA, 40, NA, 42.5, NA, 45), chinup = c(NA, NA, NA, 
    NA, NA, NA), clean = c(NA, NA, NA, NA, NA, NA)), row.names = c(NA, 
-6L), class = "data.frame")

This is the code I have written so far. First I clean my data using tidyverse. Then I want to plot the dataset in one ggplot.

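library(dplyr)
library(tidyr)
library(ggplot2)

# Reshape to long format: one row per date and lift
ss <- ss %>%
  select(date, bw, squat, deadlift, press, bench, chinup, clean) %>%
  gather(key = "lift", value = "weight", -date, -bw)

# One line per lift; NA weights interrupt the lines
ggplot(ss, aes(x = date, y = weight)) + 
  geom_point(aes(color = lift)) +
  geom_line(aes(color = lift, linetype = lift))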

My plot as of now. Oh the gaps!...

stefan
kaos
  • Drop the NAs after gathering, e.g. `filter(!is.na(weight))` – stefan Jul 02 '20 at 21:02
  • @stefan That worked, but I had to exclude `bw` from the gather as it contained mostly NAs. Thank you for your help. (Would you like to write an answer or should I answer my question myself?) – kaos Jul 02 '20 at 21:13
  • Does this answer your question? [Connecting across missing values with geom_line](https://stackoverflow.com/questions/9617629/connecting-across-missing-values-with-geom-line) –  Jul 02 '20 at 21:14
  • @kaos. Yep. Sorry. The `drop_na` was a bit too radical. Filter is the way to go. – stefan Jul 02 '20 at 21:16
  • @stefan Ah, perfect. Thank you! – kaos Jul 02 '20 at 21:19
  • @Adam Yes, it seems to do the same thing, although I think stefan's answer is easier to understand and uses the coding approach I provided. But thank you anyway for your effort! – kaos Jul 02 '20 at 21:22
  • @kaos yes his answer is definitely a needed update using `tidyverse` and I prefer it as well. But I think it is useful to link the questions as they ask essentially the same thing. –  Jul 02 '20 at 21:25
  • @Adam Sure, thank you. – kaos Jul 02 '20 at 21:28

2 Answers

Simply drop the NAs after gathering with filter(!is.na(weight)):

library(dplyr)
library(tidyr)
library(ggplot2)

# Reshape to long format, then drop every row with a missing weight
ss <- ss %>%
  select(date, bw, squat, deadlift, press, bench, chinup, clean) %>%
  gather(key = "lift", value = "weight", -date, -bw) %>% 
  filter(!is.na(weight))

# With no NA rows left, each lift is drawn as one connected line
ggplot(ss, aes(x = date, y = weight)) + 
  geom_point(aes(color = lift)) +
  geom_line(aes(color = lift, linetype = lift))
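
As a side note, `gather()` is superseded in current tidyr releases by `pivot_longer()`, whose `values_drop_na = TRUE` argument drops the missing values during the reshape itself. The following is only a minimal sketch of the same idea, assuming tidyr 1.0.0 or later. The names `ss_wide`, `ss_long` and `p` are made up for illustration, and the data frame is rebuilt from the values in the question.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Data from the question, rebuilt with data.frame() (ss_wide is a hypothetical name)
ss_wide <- data.frame(
  date     = as.Date(c(18421, 18423, 18425, 18428, 18431, 18435),
                     origin = "1970-01-01"),
  bw       = c(NA, NA, NA, 95.4, NA, NA),
  squat    = c(40, 42.5, 45, 47.5, 50, 52.5),
  deadlift = c(60, NA, 62.5, 65, 67.5, 70),
  press    = c(25, NA, 27.5, NA, 30, NA),
  bench    = c(NA, 40, NA, 42.5, NA, 45),
  chinup   = rep(NA_real_, 6),
  clean    = rep(NA_real_, 6)
)

# Reshape every column except date and bw; rows with a missing weight are dropped
ss_long <- ss_wide %>%
  pivot_longer(
    cols = -c(date, bw),
    names_to = "lift",
    values_to = "weight",
    values_drop_na = TRUE
  )

# No NA weights remain, so geom_line() connects the points of each lift
p <- ggplot(ss_long, aes(x = date, y = weight, color = lift)) +
  geom_point() +
  geom_line(aes(linetype = lift))
print(p)
```

Lifts that were never performed (`chinup` and `clean` in this data) vanish from the long table and therefore from the legend, which is the same outcome as with `filter(!is.na(weight))`.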

stefan
  • Perfect! I have been using `R` for a while now, but only now do I realise how much easier everything gets when using piping and the tidyverse. – kaos Jul 02 '20 at 21:19

This is probably a less clean approach, but the lines will be connected when the missing values are on the x axis. So you can swap x and y in your plot, and then flip the coordinates back with coord_flip().

ggplot(ss, aes(x = weight, y = date)) + 
  geom_point(aes(color = lift)) +
  geom_line(aes(color = lift, linetype = lift)) +
  coord_flip()