My problem is related to: Axis labels on two lines with nested x variables (year below months)

However, my data looks a little different.

library(dplyr)

set.seed(122)
df <- as_tibble(rlnorm(1260, meanlog = 0.06, sdlog = 0.20))

df$month <- rep(c("Jan", "Feb", "Mär", "Apr", "Mai", "Jun", 
      "Jul", "Aug", "Sep", "Okt", "Nov", "Dez"), 5, each=21)

df$year <- rep(c("Year 1", "Year 2", "Year 3", "Year 4", "Year 5" ), 1, each=252)

I would like my line graph to look like this, but without the vertical line if possible:

enter image description here

Gregor Thomas
Jj Blevins
  • take a look at the answer here: https://stackoverflow.com/questions/20571306/multi-row-x-axis-labels-in-ggplot-line-chart – morgan121 May 28 '19 at 11:29
  • @RAB I tried using this, but I cannot recode it to my problem. The problem for me is that I cannot use the line `lex.order=TRUE` with months and years. – Jj Blevins May 28 '19 at 11:31
  • take out that line then and see what it looks like – morgan121 May 28 '19 at 11:32
  • @RAB It's hard to describe but it doesn't look like what I want. – Jj Blevins May 28 '19 at 11:39
  • Then show us what it looks like, give us the code you used, and tell us which bit doesn't look the way you want it to. You need to help us out a little here man, we can't read your mind – morgan121 May 28 '19 at 13:12
  • @RAB sorry, adjusted my data set. – Jj Blevins May 28 '19 at 13:47
  • You've got multiple values for each month-year combination, yet there is no indicator of the date. Should each value be treated as a date value, or should they be aggregated somehow by month? – acylam May 28 '19 at 14:03
  • Also, your `value` and `month` variables have 1260 elements when you initialize them, whereas `year` only has `252*4=1008`, which is `252` short. This causes an error when you run the last line. – acylam May 28 '19 at 14:07
  • @avid_useR I adjusted the yearly data. Each value can be treated as one day. However I do not want days on my x axis just months. – Jj Blevins May 28 '19 at 14:38
  • Right, so how should the day value be converted to month values? Sum by month? Average by month? You need to provide that logic. You can't expect us to make it up. – acylam May 28 '19 at 14:42
  • They shouldn’t be converted at all. Just like in the picture in my main post. You can still see the daily movements. – Jj Blevins May 28 '19 at 14:44
  • @avid_useR Do you understand what I mean? I want the daily values on the y axis, but on the x axis the month labels should only appear every 21 days. So y(1) = x(1) = Jan, y(2) = x(2) = empty ... y(22) = x(22) = Feb and so on over 5 years. – Jj Blevins May 28 '19 at 16:50

2 Answers

I can think of two ways to do this, each with their pros and cons:

Data prep:

library(dplyr)
library(tibble)
library(lubridate)
library(scales)
library(ggplot2)

set.seed(122)
df <- as_tibble(rlnorm(1260, meanlog = 0.06, sdlog = 0.20))
df$month <- rep(month.abb, 5, each=21)
df$year <- rep(c("Year 1", "Year 2", "Year 3", "Year 4", "Year 5"), 1, each=252)

# We first create a "real" date variable with year, month and day. I've chosen to add 
# "201" in front of your year, but it really doesn't matter in our case.
df <- df %>%
  group_by(year, month) %>%
  mutate(Date = as.Date(paste0("201", sub("^.+(\\d+)$", "\\1", year),
                               "-", month, "-", row_number()),
                        format = "%Y-%b-%d"))

# Since OP's daily values don't make up full months of data, 
# we need this step to show missing data correctly. 
df <- expand.grid(Date = seq.Date(from = min(df$Date), to = max(df$Date), by = "days")) %>% 
  mutate(year = paste("Year", sub("^\\d{3}(\\d)", "\\1", format(Date, "%Y"))),
         month = format(Date, "%b")) %>%
  left_join(df)

Note that I have used month.abb to replace the months provided by OP, since it looks like they are using a non-English locale.
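If you want to keep the original German month labels, one locale-independent option is to map them to month numbers before building the dates, so `as.Date` never has to parse a month name. This is only a minimal sketch: `german_months` and `month_chr` are names made up for illustration, and the year 2011 and day 1 are arbitrary.

```r
# Hypothetical lookup vector: the German abbreviations used in the question.
german_months <- c("Jan", "Feb", "Mär", "Apr", "Mai", "Jun",
                   "Jul", "Aug", "Sep", "Okt", "Nov", "Dez")

# Example month labels as they appear in the question's data.
month_chr <- c("Jan", "Mär", "Okt", "Dez")

# match() returns the position of each label, i.e. its month number.
month_num <- match(month_chr, german_months)

# Build ISO dates from numbers only, so no locale-dependent %b parsing is needed.
dates <- as.Date(sprintf("2011-%02d-%02d", month_num, 1L))
```

Because the dates are assembled from numbers, the result should not depend on the `LC_TIME` setting.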

1. Use facet_grid:

ggplot(df, aes(x = Date, y = value, group = year)) +
  geom_line() +
  facet_grid(. ~ year, scales = "free_x") +
  scale_x_date(labels = date_format("%b"), expand = c(0, 0)) +
  theme(panel.spacing.x = unit(0, "lines")) +
  ylim(c(0, 2.5))

enter image description here

I've used expand in scale_x_date to prevent ggplot from adding space at both ends of each facet, and panel.spacing.x to reduce the spacing between facets. The combination of these two gives the illusion that the panels are connected, but they are not: the end of each facet does not connect to the beginning of the next, even if there are no missing values.

2. Use geom_rect + geom_text:

# Create labels
label_range <- df %>%
  group_by(year) %>%
  summarize(xmin = min(Date),
            xmax = max(Date),
            ymin = -0.5,
            ymax = ymin + 0.15)

ggplot(df) +
  geom_line(aes(x = Date, y = value)) +
  geom_rect(data = label_range, fill = "lightcoral", color = "#f2f2f2",
            aes(xmin = xmin, xmax = xmax, 
                ymin = ymin, ymax = ymax,
                group = year)) +
  geom_text(data = label_range,
            aes(x = xmin + 365/2, y = ymin + 0.1,
                group = year, label = year)) +
  coord_cartesian(ylim = c(0, 2.5), clip = "off") +
  scale_x_date(labels = date_format("%b"), 
               date_breaks = "1 month",
               expand = c(0.01, 0.01)) +
  theme_bw() +
  theme(plot.margin = unit(c(1,1,3,1), "lines"))

enter image description here

This second method is more tedious, but our data will be treated as one continuous series.

  1. Create label_range to determine the coordinates of the four corners of each geom_rect.

  2. Using this dataset, I plotted the "facet boxes" using geom_rect and their labels using geom_text grouped by year.

  3. We want the rectangles to sit below the x-axis, so I used coord_cartesian to fix the visible y-range and set clip = "off", which keeps the rects from being clipped when they are drawn outside the plot panel.

  4. plot.margin adds some space below the x-axis for our facet labels.

  5. Notice the gaps between Dec and Jan. They are caused by missing values, which is different from the gaps between Dec and Jan in the first method.
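The label position `xmin + 365/2` above assumes each year's box spans roughly a full calendar year. A possible variation, sketched here under the assumption that you want the text centred on whatever range each year actually covers, is to compute the midpoint from `xmin` and `xmax`. The `dates` tibble and the `xmid` column are made up for this illustration and are not part of the code above.

```r
library(dplyr)

# Toy stand-in for the prepared data: two years of daily dates.
dates <- tibble(Date = seq(as.Date("2011-01-01"), as.Date("2012-12-21"), by = "day")) %>%
  mutate(year = paste("Year", as.integer(format(Date, "%Y")) - 2010L))

# Same idea as label_range, plus a midpoint column for the text labels.
label_range <- dates %>%
  group_by(year) %>%
  summarize(xmin = min(Date),
            xmax = max(Date),
            xmid = xmin + (xmax - xmin) / 2)  # Date + half the span (a difftime)
```

With this, `geom_text` could use `aes(x = xmid, ...)` instead of `xmin + 365/2`.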

acylam
  • Due to NAs in column "date" I get the following error: `Error in seq.int(0, to0 - from, by) : 'to' must be a finite number` This happens in – Jj Blevins May 28 '19 at 17:13
  • @JjBlevins Which line are you referring to? Do you mean the `expand.grid` where I create a sequence of dates? – acylam May 28 '19 at 17:29
  • exactly. If I go through the data I can see for example that df$Date[43] = NA – Jj Blevins May 28 '19 at 17:38
  • @JjBlevins I can't reproduce your error. If you run the code I posted exactly without any changes, `df$Date[43]` returns `"2011-02-12"` and `any(is.na(df$Date))` returns `FALSE`. – acylam May 28 '19 at 17:52
  • @JjBlevins I suspect you are still using the non-English months which would cause `as.Date` to return `NA`. Notice that I used `month.abb` instead of your list of months. – acylam May 28 '19 at 17:54
  • This is weird. I did copy exactly your code and it produced this error. – Jj Blevins May 28 '19 at 18:01
  • @JjBlevins Another reason I can think of: because you are using a different locale, `month.abb` doesn't work the way it should. Try replacing `month.abb` with the original list of months you used and see if that helps. Or you can set your system locale with `Sys.setlocale("LC_TIME", "C")` – acylam May 28 '19 at 18:03
library(tidyverse)

#data:
set.seed(122)
df <- as_tibble(rlnorm(1260, meanlog = 0.06, sdlog = 0.20))
#> Warning: Calling `as_tibble()` on a vector is discouraged, 
#> because the behavior is likely to change in the future. 
#> Use `tibble::enframe(name = NULL)` instead.

df$month <- rep(c("Jan", "Feb", "Mär", "Apr", "Mai", "Jun", 
                  "Jul", "Aug", "Sep", "Okt", "Nov", "Dez"), 5, each=21)

df$year <- rep(c("Year 1", "Year 2", "Year 3", "Year 4", "Year 5" ), 1, each=252)

#solution:
month_lab <- rep(unique(df$month), length(unique(df$year)))

year_lab <- unique(df$year)

df %>%
  as.data.frame() %>%
  rename(price = 1) %>% 
  mutate(rnames = rownames(.)) %>% 
  ggplot(aes(x = as.numeric(rnames), y = price, 
             group = year)) +
  geom_line() +
  labs(title = "Stock Price Chart", y = "Price", x = "date") +
  scale_x_continuous(breaks = seq(1, 1260, by = 21), 
                     labels = month_lab, expand = c(0,0)) +
  facet_grid(~year, space="free_x", scales="free_x", switch="x") +
  theme(strip.placement = "outside",
        strip.background = element_rect(fill=NA,colour="grey50"),
        panel.spacing=unit(0,"cm"))

![](https://i.imgur.com/QnCsmNd.png)

Created on 2019-05-28 by the reprex package (v0.3.0)

M--
  • If you want to use `theme_bw()` and have borders but not the border between facets (the vertical line in your question) then look into this thread https://stackoverflow.com/questions/27667017/removing-right-border-from-ggplot2-graph ... however, you can get rid of all borders by adding `panel.border = element_blank()` into `theme`. – M-- May 28 '19 at 19:32
  • Thank you once again. I adjusted this answer to the data in my previous post, but I have a break in the line graph between year 1 and year 2. Any idea where this might come from? – Jj Blevins May 28 '19 at 20:44
  • @JjBlevins You did this ```rbind(df[0,],c(1, "Jan", "Year 1"), df[1:nrow(df),])```? – M-- May 28 '19 at 20:47
  • If you adjusted it like before (I outlined it in my comment) then the code above works for me. I just need to change `y=Price` to `y=as.numeric(Price)` in the `aes`, since adding a row changes the class of that column to character. – M-- May 28 '19 at 20:53
  • yes, but now I removed the first row again. So I have 1260 rows as in the above example. Still there is a break in the line exactly between year 1 and year 2. – Jj Blevins May 28 '19 at 20:56
  • @JjBlevins What do you mean by break? I included the picture from the reprex package so it should be reproducible. (If you click on the link below the picture you can see how it works.) Elaborate on what you mean by break! Is it only between year 1 and year 2? Also note that `expand=c(0,0)` is not optional anymore. – M-- May 28 '19 at 20:58
  • By break I mean the line is not continuous in exactly this one place. It stops, there is some empty space, and then it starts again. The column price has no NA though. – Jj Blevins May 28 '19 at 21:00
  • @JjBlevins Can you share a reproducible example? I shared with you a link earlier to this thread: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – M-- May 28 '19 at 21:02
  • `df$price <- c(rep(100,252), rep(110,252), rep(120,252), rep(130,252),rep(140,252))` For this price column the same thing happens in a way. The lines jump up, but the line graph should be one continuous line without breaks. – Jj Blevins May 28 '19 at 21:10
  • @JjBlevins Wait. Of course they are not connected! they are in different facets!!! facets are multiple plots. We just removed the space between those multiple plots. you cannot connect points between two different plots. – M-- May 28 '19 at 21:20
  • @JjBlevins There's a hack described here: https://stackoverflow.com/questions/31690007/ggplot-drawing-line-between-points-across-facets you need to connect last point of each year to the first point of next year! – M-- May 28 '19 at 21:22
  • Thanks. Might I be better off switching to Excel for this one? – Jj Blevins May 28 '19 at 21:28
  • @JjBlevins Well it's your call. You can also read this https://stackoverflow.com/questions/16680063/how-can-i-add-a-table-to-my-ggplot2-output and use it along side my answer to your previous question. – M-- May 28 '19 at 21:30
  • Another thing I noticed is that in your example above January always has a white vertical grid line. However, using my data set the line is missing. Do you know what this might be about and if it's somehow connected to the "breaks"? – Jj Blevins May 28 '19 at 21:33
  • @JjBlevins Your default ggplot settings differ because you have a different version. What happens is that we have two plots adjacent to each other, so if they have a border whose color is, for example, red, then we see a red line at the end/beginning of each plot, which here is January. – M-- May 28 '19 at 21:39
  • is there a way to adjust my setting to yours? – Jj Blevins May 28 '19 at 22:01
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/194080/discussion-between-m-m-and-jj-blevins). – M-- May 28 '19 at 22:06
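Regarding the break between facets discussed in the comments above: one way to avoid it, sketched here and not taken from either answer, is to skip faceting and draw a single panel, putting the year on a second label line under each January via a newline (the approach from the question linked in the first comment). The `day` column and the `n_per_month` and `n_years` variables are assumptions for this illustration, and the year label ends up under "Jan" rather than centred under the whole year.

```r
library(ggplot2)

set.seed(122)
n_per_month <- 21   # assumed number of daily values per month, as in the question
n_years     <- 5
n_total     <- n_per_month * 12 * n_years

df <- data.frame(
  day   = seq_len(n_total),                                  # running day index
  value = rlnorm(n_total, meanlog = 0.06, sdlog = 0.20),
  year  = rep(paste("Year", seq_len(n_years)), each = n_per_month * 12)
)

# One break at the first day of every month, labelled with the month name.
month_breaks <- seq(1, n_total, by = n_per_month)
month_labels <- rep(month.abb, n_years)

# Add the year on a second line below every January label.
jan_idx <- seq(1, length(month_labels), by = 12)
month_labels[jan_idx] <- paste0("Jan\n", unique(df$year))

# Single panel, so the line is one continuous series across years.
p <- ggplot(df, aes(x = day, y = value)) +
  geom_line() +
  scale_x_continuous(breaks = month_breaks, labels = month_labels,
                     expand = c(0, 0)) +
  labs(x = NULL, y = "value")
```

Printing `p` draws the plot. With 60 month labels on one axis the text may need to be shrunk or rotated, for example through `theme(axis.text.x = ...)`.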