
I want to make a line graph that shows only the values of my date variable on the x-axis, but ggplot plots every calendar date in the range, filling in the dates that are missing from my data, which is not what I want.

This is a part of my data:

f <- structure(list(o = c(
  "2020-01-02", "2020-01-03", "2020-01-06",
  "2020-01-07", "2020-01-08", "2020-01-09", "2020-01-10", "2020-01-13",
  "2020-01-14", "2020-01-15", "2020-01-16", "2020-01-17", "2020-01-21",
  "2020-01-22", "2020-01-23", "2020-01-24", "2020-01-27", "2020-01-28",
  "2020-01-29", "2020-01-30"
), val = c(
  72.83, 75.56, 75.55, 75.98,
  74.84, 77.17, 79.75, 83.72, 84.61, 85.8, 85.89, 83.63, 87.75,
  91.81, 95.06, 100.79, 103.21, 106.62, 99.29, 93.55
), i.hold = c(
  0L,
  0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
  1L, 1L, 1L
), equity.val = c(
  72.83, 72.83, 72.83, 72.83, 72.83,
  72.83, 72.83, 72.83, 72.83, 73.85432, 73.93179, 71.98644, 75.53283,
  79.02756, 81.82508, 86.75731, 88.84038, 91.77562, 85.46615, 80.52531
), ma_5 = c(
  NA, NA, NA, NA, 74.952, 75.82, 76.658, 78.292, 80.018,
  82.21, 83.954, 84.73, 85.536, 86.976, 88.828, 91.808, 95.724,
  99.498, 100.994, 100.692
), ma_10 = c(
  NA, NA, NA, NA, NA, NA,
  NA, NA, NA, 78.581, 79.887, 80.694, 81.914, 83.497, 85.519, 87.881,
  90.227, 92.517, 93.985, 94.76
)), row.names = c(NA, -20L), class = "data.frame")

NA values are not a problem. o is my date variable, which I want to plot on the x-axis with major breaks (gridlines on the x-axis) at every fifth date. For example: the first break at 2020-01-02, the second at 2020-01-09, the third at 2020-01-16, the fourth at 2020-01-24, and so on. I also want minor breaks (gridlines on the x-axis) at each date in the data. You can find the dataset here: https://drive.google.com/file/d/1bvys_S4ZoyYBXaD4lXdAtY0GO88mWL79/view?usp=sharing

Here's my code:

ggplot(f, aes(x = o, y = val)) +
  geom_line(colour = "blue", lwd = 1) +
  geom_segment(aes(y = -Inf, yend = Inf, x = o, xend = o, alpha = i.hold),
               inherit.aes = FALSE, colour = "black", size = 5) +
  scale_alpha_continuous(range = c(0, 0.15)) +
  guides(alpha = "none") +
  geom_line(aes(y = ma_10), colour = "green", lwd = 1) +
  geom_line(aes(y = ma_5), colour = "red", lwd = 1) +
  geom_line(aes(y = equity.val), lwd = 1) +
  theme_bw() +
  labs(x = "Dates", y = "Price") +
  ggtitle("TXG") +
  theme(plot.title = element_text(hjust = 0.5),
        axis.text.x = element_text(angle = 90),
        panel.grid.major.x = element_line(colour = "black", size = 0.6),
        panel.grid.minor.x = element_line(colour = "black", size = 0.3)) +
  scale_x_date(breaks = seq(as.Date(f$o[1]), as.Date(f$o[length(f$o)]), by = 5),
               minor_breaks = seq(as.Date(f$o[1]), as.Date(f$o[length(f$o)]), by = 1), 
               date_labels = "%Y-%m-%d")

What happens with my code is that ggplot shows continuous dates, including ones that are not in the data. Here's an image of my result: (image: My output)

And I just want the ones in my date variable. I don't want any extra dates in my plot. I want the dates in the format YYYY-MM-DD.

I have tried the answer in "Breaks for scale_x_date in ggplot2 and R", but it didn't work in my case. Any other answers will be appreciated. Thank you in advance.

  • Hi OP. Welcome to SO! Can you please share your dataset via `dput(f)`? The output of that function should start with `structure(...` and can be copied and pasted directly into your question (formatted as code) in place of the text you have posted, which cannot be reproduced easily. If the dataset is too large, you can also copy and paste the output of `dput(head(f, 10))` or something similar. Also, can you post a picture of your plot instead of sharing a google drive link? (which cannot always be accessed) – chemdork123 Jul 15 '20 at 11:38
  • hi welcome to SO. What exactly did not work using the provided solution of the linked thread? Seems very helpful and more or less the same question to me. – tjebo Jul 15 '20 at 22:16
  • Does this answer your question? [Breaks for scale\_x\_date in ggplot2 and R](https://stackoverflow.com/questions/6638696/breaks-for-scale-x-date-in-ggplot2-and-r) [This is an automated comment which came up because I flagged this question as a duplicate question] – tjebo Jul 15 '20 at 22:16
  • Hi, @chemdork123 I have added the image of my output for the code. The data has only 108 rows. I have added an active link for the dataset where you can access the dataset. You can read my input file as a CSV and put it in my code to get the output. – Tanmay Gupta Jul 15 '20 at 23:33
  • Hi @Tjebo I am getting this error - Error in as.Date.numeric(value) : 'origin' must be supplied. My date variable is a date type and it is getting this error. – Tanmay Gupta Jul 15 '20 at 23:40
  • Add an origin when calling as.Date. The origin depends a bit on where your data comes from (Excel, MATLAB, OS X, etc. use different origins for dates). See ?as.Date, sections "Conversion from other Systems" and "Examples" – tjebo Jul 16 '20 at 08:30
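As a hedged illustration of the origin point in the last comment: the error appears when as.Date receives a number rather than a string or a Date, since older R versions require an origin for numeric input. The values below are made-up examples, not taken from the question's data, and the right origin depends on where the numbers came from.

```r
# Days since the Unix epoch (illustrative value, not from the question's data)
unix_days <- 18262
d_unix <- as.Date(unix_days, origin = "1970-01-01")
d_unix
#> [1] "2020-01-01"

# Excel (Windows) serial date; "1899-12-30" is the origin suggested in ?as.Date
excel_serial <- 43831
d_excel <- as.Date(excel_serial, origin = "1899-12-30")
d_excel
#> [1] "2020-01-01"

# Character input in YYYY-MM-DD form needs no origin at all
d_chr <- as.Date("2020-01-02")
```

If the column is already of class Date, calling as.Date on it again should not raise this error, so the numeric value is probably being created somewhere else in the pipeline.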

1 Answer


Gaps in dates that are not visually represented can be potentially misleading!

Simply treat the dates as discrete values: use scale_x_discrete and pass the unique dates to its limits argument. This requires the dates to be in the correct order, which they should be by default when stored in this format (YYYY-MM-DD text sorts chronologically).

Also, there are plenty of threads here that show how to combine several lines into one plot: make the data long. You can then change the appearance more conveniently with scale_color_... and reduce redundant code.

library(tidyverse)

flong <- f %>% pivot_longer(cols = matches("val|ma"), names_to = "key", values_to = "value")

unique_dates <- unique(flong$o)

ggplot(flong, aes(x = o, y = value)) +
  geom_line(aes(color = key, group = key)) +
  scale_x_discrete(limits = unique_dates, breaks = unique_dates) +
  theme(axis.text.x = element_text(angle = 90))
#> Warning: Removed 13 row(s) containing missing values (geom_path).
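A possible extension toward the "major break at every fifth date, minor break at every date" request, offered as a sketch rather than a definitive solution. It assumes the dates are kept as character strings, as in the dput output. Discrete scales have no minor breaks, so the thin per-date lines are drawn with geom_vline at the integer positions of the discrete axis instead. The names major_dates and p are introduced here for illustration, and the data frame is a two-column stand-in for the question's data.

```r
library(ggplot2)

# Two-column stand-in for the question's data; dates kept as character strings
f <- data.frame(
  o = c("2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07", "2020-01-08",
        "2020-01-09", "2020-01-10", "2020-01-13", "2020-01-14", "2020-01-15",
        "2020-01-16", "2020-01-17", "2020-01-21", "2020-01-22", "2020-01-23",
        "2020-01-24", "2020-01-27", "2020-01-28", "2020-01-29", "2020-01-30"),
  val = c(72.83, 75.56, 75.55, 75.98, 74.84, 77.17, 79.75, 83.72, 84.61, 85.8,
          85.89, 83.63, 87.75, 91.81, 95.06, 100.79, 103.21, 106.62, 99.29, 93.55)
)

# YYYY-MM-DD strings sort chronologically, so sort() gives date order
unique_dates <- sort(unique(f$o))

# Every fifth date (1st, 6th, 11th, ...) becomes a labelled major break
major_dates <- unique_dates[seq(1, length(unique_dates), by = 5)]

p <- ggplot(f, aes(x = o, y = val, group = 1)) +
  # Stand-in for minor gridlines: one thin line per date in the data.
  # Position i on a discrete axis corresponds to the i-th limit.
  geom_vline(xintercept = seq_along(unique_dates), colour = "grey70") +
  geom_line(colour = "blue") +
  # limits keeps only dates present in the data; breaks labels every fifth one
  scale_x_discrete(limits = unique_dates, breaks = major_dates) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90),
        panel.grid.major.x = element_line(colour = "black"))

p
```

Because the axis is discrete, the dates are evenly spaced regardless of the calendar gaps between them (weekends, holidays), which is the potentially misleading effect mentioned at the top of this answer.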

tjebo
  • Hi @Tjebo, Thank you for your answer. I want the output with all the dates. I want the major breaks at 5th date and minor at every date. Can you help me with that? – Tanmay Gupta Jul 16 '20 at 16:00
  • @TanmayGupta not sure I understand what you mean when saying "I want the output with all the dates" - thought this was what you wanted to avoid in the first place? – tjebo Jul 17 '20 at 11:59