I have some rainfall data collected continuously from which I have calculated daily totals. Here is some toy data:

Date <- c(seq(as.Date("2016-07-01"), by = "1 day", length.out = 10))
rain_mm <- c(3,6,8,12,0,0,34,23,5,1)
rain_data <- data.frame(Date, rain_mm)

I can plot this data as follows:

ggplot(rain_data, aes(Date, rain_mm)) +
  geom_bar(stat = "identity") +
  scale_x_date(date_labels = "%d")

Which gives the following:

enter image description here

This seems fine: it is clear how much rain fell on a given day. However, because each bar is centred on its date, the plot could also be read as showing the rain that fell between midday of one day and midday of the next, which is wrong. This is especially a problem if the graph is combined with other plots of related continuous variables over the same period.

To get round this issue I could use geom_step as follows:

library(ggplot2)
ggplot(rain_data, aes(Date, rain_mm)) +
  geom_step() +
  scale_x_date(date_labels = "%d")

Which gives:

enter image description here

This is a better way to display the data, and scale_x_date now appears to be a continuous axis. However, it would be nice to fill the area below the steps, but I can't find a straightforward way of doing this.

Q1: How can I fill beneath the geom_step? Is it possible?

It may also be useful to convert Date into POSIXct to facilitate identical x-axis in multi-plot figures as discussed in this SO question here. I can do this as follows:

library(dplyr)
rain_data_POSIX <- rain_data %>% mutate(Date = as.POSIXct(Date))

                  Date rain_mm
1  2016-07-01 01:00:00       3
2  2016-07-02 01:00:00       6
3  2016-07-03 01:00:00       8
4  2016-07-04 01:00:00      12
5  2016-07-05 01:00:00       0
6  2016-07-06 01:00:00       0
7  2016-07-07 01:00:00      34
8  2016-07-08 01:00:00      23
9  2016-07-09 01:00:00       5
10 2016-07-10 01:00:00       1

However, this gives a time of 01:00 for each date; I would rather have 00:00. Can I change this in the as.POSIXct function call, or do I have to do it afterwards using a separate function? I think it has something to do with tz = "" but I can't figure it out.

Q2: How can I convert from class Date to POSIXct so that the time generated is 00:00?

Thanks

Rory Shaw
  • First question is a potential duplicate of: http://stackoverflow.com/questions/21887088/generate-a-filled-geom-step – Artem Sokolov Feb 13 '17 at 17:34
  • Second question: try `tz="GMT"` or simply remove 1 hour: `as.POSIXct(Date) - 3600`. – timat Feb 13 '17 at 17:37
  • @ArtemSokolov I had seen that but couldn't really figure the answers out. Also wanted to see if anything had changed recently. Thanks – Rory Shaw Feb 13 '17 at 17:41
  • @timat `tz="GMT"` doesn't have any effect, I get the same 01:00 time. I could just subtract an hour's worth of seconds, but that doesn't help with my understanding or use of the `as.POSIXct` function for longer code segments – Rory Shaw Feb 13 '17 at 17:43
  • @RoryShaw I added an edit to my response to your second question that adapts the linked example to your data with some additional explanations. – Artem Sokolov Feb 13 '17 at 18:11

2 Answers

For your first question, you can work off this example. First, create a time-lagged version of your data:

rain_tl <- mutate( rain_data, rain_mm = lag( rain_mm ) )

Then combine this time-lagged version with the original data, and re-sort by date:

rain_all <- bind_rows( old = rain_data, new = rain_tl, .id="source" ) %>%
    arrange( Date, source ) 

(Note the newly created source column is used to break ties, correctly interlacing the original data with the time-lagged version):

> head( rain_all )
  source       Date rain_mm
1    new 2016-07-01      NA
2    old 2016-07-01       3
3    new 2016-07-02       3
4    old 2016-07-02       6
5    new 2016-07-03       6
6    old 2016-07-03       8    

You can now use the combined data frame to "fill" your steps:

ggplot(rain_data, aes(Date, rain_mm)) +
  geom_step() +
  geom_ribbon( data = rain_all, aes( ymin = 0, ymax = rain_mm ),
             fill="tomato", alpha=0.5 )

This produces the following plot:

enter image description here
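
The steps above can be pulled together into one self-contained script. This is only a sketch: it builds the interlaced data with base R instead of dplyr, and it drops the leading NA row so that geom_ribbon has no missing values to deal with. Both of those are choices made here for illustration, not part of the approach described above.

```r
library(ggplot2)

Date <- seq(as.Date("2016-07-01"), by = "1 day", length.out = 10)
rain_mm <- c(3, 6, 8, 12, 0, 0, 34, 23, 5, 1)
rain_data <- data.frame(Date, rain_mm)

# Time-lagged copy: each date carries the previous day's value (NA for the first date).
rain_tl <- transform(rain_data, rain_mm = c(NA, head(rain_mm, -1)))

# Interlace the lagged ("new") and original ("old") rows, date by date.
rain_all <- rbind(
  cbind(source = "new", rain_tl),
  cbind(source = "old", rain_data)
)
rain_all <- rain_all[order(rain_all$Date, rain_all$source), ]

# Assumption of this sketch: drop the leading NA row before plotting.
rain_all <- rain_all[!is.na(rain_all$rain_mm), ]

# Draw the ribbon first so the step line sits on top of the fill.
p <- ggplot(rain_data, aes(Date, rain_mm)) +
  geom_ribbon(data = rain_all, aes(ymin = 0, ymax = rain_mm),
              fill = "tomato", alpha = 0.5) +
  geom_step() +
  scale_x_date(date_labels = "%d")
p
```

The ribbon follows the same corners as geom_step with its default direction (horizontal first, then vertical), which is why the fill lines up with the step line.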


For your second question, the problem is that as.POSIXct, when given a Date, does not pass additional arguments to the converter, so specifying the tz argument does nothing.

You basically have two options:

1) Reformat the output to what you want: format( as.POSIXct( Date ), "%F 00:00" ), which returns a vector of type character. If you want to preserve the object type as POSIXct, you can instead...

2) Cast your Date vector to character prior to passing it to as.POSIXct: as.POSIXct( as.character(Date) ). The result is midnight in your local time zone, so the time is left off entirely when printed, which may be what you want anyway.
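
A small sketch of what is going on, under the assumption that the 01:00 in the question comes from a session time zone that is one hour ahead of UTC in July. Converting a Date directly gives the instant of midnight UTC, which is then printed in local time; going via character parses the date as midnight in whichever zone is requested. The zone "Europe/London" below is only an illustrative choice, not something stated in the question.

```r
Date <- seq(as.Date("2016-07-01"), by = "1 day", length.out = 10)

# Direct conversion: the stored instant is midnight UTC.
direct <- as.POSIXct(Date)
format(direct, "%H:%M", tz = "UTC")   # time of day when displayed in UTC

# Via character: the string is parsed as midnight in the zone given by tz.
# "Europe/London" is an assumed zone for illustration.
local_midnight <- as.POSIXct(as.character(Date), tz = "Europe/London")
format(local_midnight, "%H:%M")       # time of day in the object's own zone
```

If several plots have to share an x-axis, it probably helps to pass the same explicit tz everywhere rather than rely on the session default.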

Artem Sokolov
  • Thanks for going through that. Looking at the `geom_bar` solution above, do you have any idea why I can't specify a `width` argument when used in conjunction with `scale_x_datetime`? – Rory Shaw Feb 13 '17 at 18:21

If you would like to avoid the hack, you can customize the position in the geom_bar expression.

I found good results with:

ggplot(rain_data, aes(Date, rain_mm)) +
  geom_bar(stat = "identity", position = position_nudge(x = 0.51), width = 0.99) +
  scale_x_date(date_labels = "%d")

enter image description here
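
A different way to get the same effect without tuning the nudge and width by eye is to draw each day as an explicit rectangle running from its own date to the next one. This swaps geom_bar for geom_rect, so it is an alternative sketch rather than the approach shown above; the axis labels are set by hand because geom_rect does not use x and y aesthetics.

```r
library(ggplot2)

Date <- seq(as.Date("2016-07-01"), by = "1 day", length.out = 10)
rain_mm <- c(3, 6, 8, 12, 0, 0, 34, 23, 5, 1)
rain_data <- data.frame(Date, rain_mm)

# Each rectangle spans exactly one day: from midnight of Date to midnight of Date + 1.
p <- ggplot(rain_data) +
  geom_rect(aes(xmin = Date, xmax = Date + 1, ymin = 0, ymax = rain_mm)) +
  scale_x_date(date_labels = "%d") +
  labs(x = "Date", y = "rain_mm")
p
```

Using something like xmax = Date + 0.99 instead would leave a thin gap between days, similar to the width = 0.99 choice above.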

Pierre L
  • Thanks, I was thinking along the same lines by converting `Date` to `POSIXct` and setting the time to 12:00. This makes it easier for me when aligning multiple plots with the same x-axis. However, `geom_bar` doesn't appear to accept a `width` argument with `scale_x_datetime`... – Rory Shaw Feb 13 '17 at 18:05
  • Also looks better with `width = 1` – Rory Shaw Feb 13 '17 at 18:05
  • I didn't like the way `width = 1` looked. The slight partition is clean and shows a clear daybreak. But you have the tools you need now – Pierre L Feb 13 '17 at 18:06