
I have a recurring problem that is driving me nuts. I am building a ggplot2 plot using geom_area with dates on the x-axis. I am trying to space the dates at equal distances from each other, but I cannot see how. I am attaching my dummy data. Option 1 plots my data fine, but dates that are close in time are clustered together on the axis. I want to make them equidistant as in option 2, but with "date_sr" on the x-axis the percentage areas are not drawn.

I really appreciate any help you can provide.

require(ggplot2)
library(reshape2)
library(RColorBrewer)

sex <- c('F','F','F',
         'M','M','M')

date <- c("26/11/2018","08/02/2020","08/09/2020", 
          "26/11/2018","08/02/2020","08/09/2020")
         
percentage <- c(40, 30, 20, 60, 70, 80)          


df <- data.frame(sex, date, percentage)
print(df)

#option 1
df$date<- as.Date(df$date,format="%d/%m/%Y")
ourdates<-(unique(df$date))
df

area1 <- ggplot(df, aes(date, percentage,fill=sex)) + 
  geom_area()+
  scale_y_continuous(breaks = seq(0,100,10))+
  scale_x_date(breaks = ourdates, date_labels = "%d %b %Y")+ 
  scale_fill_brewer(labels=c("Female","Male"),palette ="Paired")

plot(area1)



#option 2
df$date<- as.Date(df$date,format="%d/%m/%Y")
mydate<-format(df$date, "%d %b %Y")
date_sr<-factor(mydate, levels = rev(unique(mydate)),ordered = TRUE)

#if we do not re-define date_sr as date it won't plot the graph (but then it won't plot the date in the correct format)
#date_sr<-as.Date(df$date, format="%d/%b/%Y")

area2<-ggplot(df,aes(fill=sex,y=percentage,x=date_sr))+
  geom_area()+
  scale_y_continuous(breaks = seq(0,100,10))+
  scale_fill_brewer(labels=c("Female","Male"),palette ="Paired")

plot(area2)

geom_area plotting sex ratio between females and males. Notice 08/Feb/2020 is closer to 08/Sep/2020. I want the 3 dates to be plotted equidistant from each other and to format dates as "%d %b %Y".


Ronak Shah
Irene

1 Answer


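This seems tricky. By default, geom_area does not draw anything when x is a factor, because each factor level ends up in its own group. However, given you want equidistant dates, we can plot rank(date) on a continuous axis and label the breaks with the dates.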

sex <- c('F','F','F',
         'M','M','M')

date <- c("26/11/2018","08/02/2020","08/09/2020", 
          "26/11/2018","08/02/2020","08/09/2020")

percentage <- c(40, 30, 20, 60, 70, 80)          


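# Name the date column explicitly: an unnamed as.Date(...) argument gets an
# auto-generated column name, so df$date would not exist.
df <- data.frame(sex,
                 date = as.Date(date, format = "%d/%m/%Y"),
                 percentage)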

area1 <- ggplot(df, aes(rank(date), percentage,fill=sex)) + 
  geom_area()+
  scale_y_continuous(breaks = seq(0,100,10))+
  scale_x_continuous(breaks = rank(df$date),
                     labels = format(df$date, "%d/%m/%Y")) +
  scale_fill_brewer(labels=c("Female","Male"),palette ="Paired")

plot(area1)
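
As an alternative to rank, here is a minimal sketch that keeps a factor on the x-axis. It assumes the reason the factor version draws nothing is ggplot2's default grouping (the interaction of all discrete variables, including x), which leaves one point per group. Mapping group = sex explicitly should give one area per sex. The names date_levels, date_label and area3 are made up for this example, and the "%d %b %Y" labels depend on your locale.

```r
library(ggplot2)

df <- data.frame(
  sex = c("F", "F", "F", "M", "M", "M"),
  date = as.Date(c("26/11/2018", "08/02/2020", "08/09/2020",
                   "26/11/2018", "08/02/2020", "08/09/2020"),
                 format = "%d/%m/%Y"),
  percentage = c(40, 30, 20, 60, 70, 80)
)

# Build the factor levels from the sorted dates so the axis stays chronological.
date_levels <- format(sort(unique(df$date)), "%d %b %Y")
df$date_label <- factor(format(df$date, "%d %b %Y"), levels = date_levels)

# group = sex overrides the default grouping, which would otherwise split the
# data by every combination of date_label and sex (one point per group).
area3 <- ggplot(df, aes(x = date_label, y = percentage,
                        fill = sex, group = sex)) +
  geom_area() +
  scale_y_continuous(breaks = seq(0, 100, 10)) +
  scale_fill_brewer(labels = c("Female", "Male"), palette = "Paired")

plot(area3)
```

Because a discrete axis places its levels at 1, 2, 3, the three dates come out equidistant, and the labels are the factor levels themselves.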
mzuba
  • Hi mzuba! Thanks for your reply! Yes, this does fix the equidistance of the x-axis; however, you lose the ordering of the dates. 26 Nov 2018 is now after 2020, and I want to keep the correct temporal order of my data (2018, 2020). This is why in option 2, I made "date" a factor, but from your reply, geom_area won't take factors.... what options do I have then? Cheers, – Irene Jun 21 '21 at 13:20
  • Hi Irene! That's strange, on my machine the date axes are labelled correctly. This is because R can correctly order dates if they are stored as Date (in df$date). – mzuba Jun 21 '21 at 14:50
  • Okay, now it's working! I had to call colnames(df) <- c("sex","date","percentage"), and now it works perfectly. Thanks for your help! – Irene Jun 21 '21 at 15:17
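
For anyone hitting the same ordering problem, here is a small base-R sketch of what the comments above describe. It assumes the cause was the unnamed date column: without a column called date, rank(date) presumably fell back to the character vector date in the workspace, which ranks alphabetically instead of chronologically. The names df_unnamed, df_named, rank_chr and rank_date are made up for this example.

```r
sex <- c("F", "F", "F", "M", "M", "M")
date <- c("26/11/2018", "08/02/2020", "08/09/2020",
          "26/11/2018", "08/02/2020", "08/09/2020")
percentage <- c(40, 30, 20, 60, 70, 80)

# Unnamed argument: the column name is derived from the expression itself,
# so the data frame has no column called "date".
df_unnamed <- data.frame(sex, as.Date(date, format = "%d/%m/%Y"), percentage)
print("date" %in% names(df_unnamed))

# Named argument: df_named$date exists and holds Date values.
df_named <- data.frame(sex,
                       date = as.Date(date, format = "%d/%m/%Y"),
                       percentage)

# Character strings rank alphabetically ("08/..." sorts before "26/..."),
# whereas Date values rank chronologically.
rank_chr <- rank(date)
rank_date <- rank(df_named$date)
print(rank_chr)
print(rank_date)
```

Naming the column in data.frame(), or renaming it afterwards with colnames(df) as in the last comment, lets rank() see the Date column rather than the character vector.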