I am trying to make a scatter plot with points colored by date. Currently I am doing the following, and the graph looks the way I want, but I have not found a way to show the dates in a readable format in the legend. I tried encoding them as integers such as 20140101, but then a whole year falls within a small range (20140101 to 20141231), so I don't get different colors within the year.

library(ggplot2)

data(cars)
# one date per row of cars (50 rows), stepping back 20 days at a time
cars['dt'] = seq(Sys.Date(),Sys.Date()-980,-20)

ggplot(cars,aes(speed,dist,colour = as.integer(dt))) + geom_point(alpha = 0.6) +
  scale_colour_gradientn(colours=c('red','green','blue'))

Can someone recommend a solution please? To be specific, I would like every date to be a different color or shade of a color. (My actual data has about 5-6 years of daily observations.)

hjw

3 Answers


Getting around the 'origin must be supplied' error: one approach is to write a wrapper for as.Date that has the origin specified, then pass it as the labeller.

as.Date_origin <- function(x){
  as.Date(x, origin = '1970-01-01')
}

data(cars)
cars['dt'] = seq(Sys.Date(),Sys.Date()-980,-20)

ggplot(cars,aes(speed,dist,colour=as.integer(dt))) + geom_point(alpha = 0.6) +
  scale_colour_gradientn(name = 'Date', colours=c('red','green','blue'), labels=as.Date_origin)
Tony Ladson
  • @Tony Ladson your workaround is very helpful. I have a follow-up question: how do you format the legend labels as "%d %Y"? I don't know which argument takes the 'format = '. Thanks! – mand3rd Feb 25 '20 at 05:45
  • Hi @mand3rd. You can do this: in the `as.Date_origin` function, change `as.Date(x, origin = '1970-01-01')` to `format(as.Date(x, origin = '1970-01-01'), format = '%d %Y')` – Tony Ladson Feb 25 '20 at 22:07
  • This also works as an anonymous function: `scale_colour_gradientn(name = 'Date', colours=c('red','green','blue'), labels=function(x) { as.Date(x, origin = '1970-01-01') })` – Jake Fisher Jul 15 '22 at 23:26
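
For readers who want the formatted labels discussed in the comments in one place, here is a minimal sketch. The name `date_label` and its `fmt` argument are made up for illustration; the rest uses only `as.Date()`, `format()` and the ggplot2 calls already shown in this answer.

```r
# Hypothetical labeller: converts the integer day counts that ggplot2 passes
# as legend breaks back to dates, then formats them as text.
date_label <- function(x, fmt = "%d %Y") {
  format(as.Date(x, origin = "1970-01-01"), format = fmt)
}

# Day 16071 since 1970-01-01 is 2014-01-01.
date_label(16071)              # "01 2014"
date_label(16071, "%Y-%m-%d")  # "2014-01-01"

# Usage with the plot from the answer (only runs if ggplot2 is installed).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  data(cars)
  cars$dt <- seq(Sys.Date(), Sys.Date() - 980, by = -20)
  p <- ggplot(cars, aes(speed, dist, colour = as.integer(dt))) +
    geom_point(alpha = 0.6) +
    scale_colour_gradientn(name = "Date", colours = c("red", "green", "blue"),
                           labels = date_label)
}
```

Keeping the format string as an argument means the same helper can be reused with a different legend format, for example `labels = function(x) date_label(x, "%Y-%m")`.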

Just add a labeler function:

ggplot(cars,aes(speed,dist,colour=as.integer(dt))) + geom_point(alpha = 0.6) +
  scale_colour_gradientn(colours=c('red','green','blue'), labels=as.Date)

BrodieG

scale_colour_gradient and scale_colour_gradientn are meant for continuous data, so the dates should first be converted to a continuous format with as.integer(). Otherwise, ggplot2 might raise an error such as "Error: Discrete value supplied to continuous scale".
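
To illustrate why as.integer() on a Date gives a usable continuous variable, here is a small sketch in base R. The four dates are made up for the comparison; the YYYYMMDD encoding is the one the question tried.

```r
# Illustrative dates around two year boundaries (not from the question's data).
d <- as.Date(c("2013-12-31", "2014-01-01", "2014-12-31", "2015-01-01"))

# as.integer() on a Date gives days since 1970-01-01, so the gaps are real day counts.
days <- as.integer(d)
diff(days)  # 1 364 1

# Encoding dates as YYYYMMDD integers distorts the spacing: one day across a
# year boundary is a jump of 8870, while a whole year spans only 1130.
ymd <- as.integer(format(d, "%Y%m%d"))
diff(ymd)   # 8870 1130 8870
```

On a continuous colour scale the second encoding squeezes each year into a narrow band of colours, which matches the behaviour described in the question.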

In many cases the date column is a factor, so we need as.integer(as.Date()) to get correctly transformed data. Not a big problem, but it can be tricky. The origin in the self-defined function is 1970-01-01, the reference date from which R counts days when it stores a Date as a number. See ?as.Date for more information.
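
A minimal sketch of the factor case described above, using a made-up two-element factor `f`. The explicit as.character() step is an extra precaution added here for clarity, not something the answer requires.

```r
# Hypothetical date column that was read in as a factor.
f <- factor(c("2014-12-31", "2014-01-01"))

# as.integer() on a factor returns the level codes, not the dates.
codes <- as.integer(f)
codes  # 2 1

# Going through as.Date() first gives days since 1970-01-01.
days <- as.integer(as.Date(as.character(f)))
days   # 16435 16071

# The labeller direction: day counts back to dates.
as.Date(days, origin = "1970-01-01")
```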

data(cars)
cars['dt'] = seq(Sys.Date(),Sys.Date()-980,-20)

trans_date <- function(x){
  as.Date(x, origin = '1970-01-01')
}
    
p <- ggplot(cars,aes(speed,dist,colour=as.integer(as.Date(dt)))) 
p + geom_point(alpha = 0.6) +
    scale_colour_gradientn(name = 'Date', colours=c('red','green','blue'), labels=trans_date)
WEI YAN
  • It isn't obvious that this answer is different from another already posted answer. – ouflak Jan 22 '22 at 17:29
  • Hi @ouflak. This answer is not totally different from the others. You may not see the value of the differences because you already have a deep understanding of R and of the other answers. But I did spend some time figuring out why scale_colour_gradientn can be used this way and fixing the data format to fit my case, even if these are just small tricks. I do not want others to waste time on the same issue, so I wrote up what I learned, hoping to help those who are also new. I also appreciate that you took the time to look at the answer and comment. – WEI YAN Jan 23 '22 at 19:28
  • @ouflak Maybe next time I can comment on the other answers to make my points, although they might not all apply to one answer. Thanks! – WEI YAN Jan 23 '22 at 19:28