I am creating a bar graph whose x-axis labels are consecutive fiscal years, such as "2009/10", "2010/11", etc. A column in my dataset holds the fiscal year at which I would like the x-labels to begin (see example image below). From there, I would like a label for every consecutive fiscal year up to the present, so the last x-label should be "2018/19". When I try to set the limits with scale_x_continuous, I get the error "Error: Discrete value supplied to continuous scale". However, if I use scale_x_discrete, I get a graph with only two bars: my chosen "Start" date and the "End" of 2018/19.

Start<-Project_x$Start[c(1)]
End<-"2018/2019"

ggplot(Project_x, (aes(x=`FY`, y=Amount)), na.rm=TRUE)+
geom_bar(stat="identity", position="stack")+
scale_x_continuous(limits = c(Start,End))

`Error: Discrete value supplied to continuous scale`

Thank you.

My data is:

df <- data.frame(Project = c(5, 6, 5, 5, 9, 5), 
             FY = c("2010/11","2017/18","2012/13","2011/12","2003/04","2000/01"),
             Start=c("2010/11", "2011/12", "2010/11", "2010/11", "2001/02", "2010/11"),
             Amount = c(500,502,788,100,78,NA))

To use the code in the answer below, I need to base my Start_Year off of my Start column rather than the FY column, and the graph should just be for Project #5.

as.tibble(df) %>% 
mutate(Start_Year = as.numeric(sub("/\\d{2}","",Start)))
xlabel_start<-subset(df$Start_Year, Project == 5)
xlabel_end<-2018
filter(between(Start_Year,xlabel_start,xlabel_end)) %>%
  ggplot(aes(x = FY, y = Amount))+
  geom_col()

When running this, my xlabel_start is NULL.

enter image description here

Jessica Marie
  • Can you provide a reproducible example of your dataset (see here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), something like the output of `dput(Project_x)`? – dc37 Mar 27 '20 at 16:34

1 Answer

In ggplot2, continuous scales are meant for numeric values. Here, your fiscal years are stored as character (or factor) values, so they are treated as discrete and are sorted alphabetically by ggplot2.

One possible solution to get your expected plot is to create a new numeric variable containing the starting year of each fiscal year, and then filter for values between 2010 and 2018.
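As a minimal sketch of that conversion, the two functions below (fy_to_year and year_to_fy are hypothetical names, not part of any package) turn a fiscal-year label into its numeric starting year and back again, using only base R:

```r
# Hypothetical helpers for illustration; base R only.

# "2010/11" -> 2010: drop the trailing "/yy" and convert to numeric.
fy_to_year <- function(fy) {
  as.numeric(sub("/\\d{2}$", "", as.character(fy)))
}

# 2010 -> "2010/11": append the last two digits of the following year.
year_to_fy <- function(year) {
  sprintf("%d/%02d", year, (year + 1) %% 100)
}

fy_to_year(c("2010/11", "2017/18"))
year_to_fy(2010:2012)
```

The first call returns 2010 and 2017, and the second returns "2010/11", "2011/12" and "2012/13". Once the years are numeric, they can be compared and filtered with between().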

But first, we are going to isolate the project and starting year of interest by creating a new data frame:

library(dplyr)

# Start year of project 5, taken from the Start column
xlabel_start <- as_tibble(df) %>% 
  mutate(Start_Year = as.numeric(sub("/\\d{2}","",Start))) %>%
  distinct(Project, Start_Year) %>%
  filter(Project == 5)

xlabel_start

# A tibble: 1 x 2
  Project Start_Year
    <dbl>      <dbl>
1       5       2010

Now, using almost the same pipeline, we can isolate values of interest by doing:

library(tidyverse)

xlabel_end <- 2018  # starting year of the last fiscal year to show (2018/19)

as_tibble(df) %>% 
  mutate(Year = as.numeric(sub("/\\d{2}","",FY))) %>%
  filter(Project == 5 & between(Year,xlabel_start$Start_Year,xlabel_end))

# A tibble: 3 x 5
  Project FY      Start   Amount  Year
    <dbl> <fct>   <fct>    <dbl> <dbl>
1       5 2010/11 2010/11    500  2010
2       5 2012/13 2010/11    788  2012
3       5 2011/12 2010/11    100  2011

And once you have done this, you can simply add the ggplot plotting part at the end of this pipe sequence:

library(tidyverse)

as_tibble(df) %>% 
  mutate(Year = as.numeric(sub("/\\d{2}","",FY))) %>%
  filter(Project == 5 & between(Year,xlabel_start$Start_Year,xlabel_end)) %>%
  ggplot(aes(x = FY, y = Amount))+
  geom_col()

enter image description here
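If the same plot is needed for several projects with different start years, the steps above could be wrapped in a function. The sketch below is only one way to do it: plot_project is a hypothetical helper, end_year = 2018 is an assumed default, and the data frame is the example from the question. It also passes the full set of fiscal-year labels to scale_x_discrete() so that years without data still get an x-label.

```r
library(dplyr)
library(ggplot2)

# Example data from the question
df <- data.frame(
  Project = c(5, 6, 5, 5, 9, 5),
  FY      = c("2010/11", "2017/18", "2012/13", "2011/12", "2003/04", "2000/01"),
  Start   = c("2010/11", "2011/12", "2010/11", "2010/11", "2001/02", "2010/11"),
  Amount  = c(500, 502, 788, 100, 78, NA)
)

# Hypothetical helper: plot one project from its Start year up to end_year.
plot_project <- function(data, project_id, end_year = 2018) {
  proj <- data %>%
    filter(Project == project_id) %>%
    mutate(
      FY         = as.character(FY),
      Year       = as.numeric(sub("/\\d{2}$", "", FY)),
      Start_Year = as.numeric(sub("/\\d{2}$", "", as.character(Start)))
    )

  # Every fiscal year from the project's start to end_year, as "yyyy/yy" labels
  start_year <- min(proj$Start_Year)
  years      <- seq(start_year, end_year)
  fy_levels  <- sprintf("%d/%02d", years, (years + 1) %% 100)

  proj %>%
    filter(between(Year, start_year, end_year)) %>%
    ggplot(aes(x = FY, y = Amount)) +
    geom_col() +
    scale_x_discrete(limits = fy_levels)  # full list of labels, in order
}

# One plot per project, without typing any start year by hand
plots <- lapply(unique(df$Project), function(id) plot_project(df, id))
```

For a discrete scale, limits is the complete set of values to display, which is why supplying only the start and end labels produced a graph with two bars.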

Does this answer your question?

dc37
  • Thank you, but I want to avoid manually entering years in the filter function because I will be creating multiple graphs in a loop for projects that have varying Start Years. – Jessica Marie Mar 27 '20 at 17:04
  • You can definitely store the starting and ending years in variables before the filter function and reuse them as you wish. I edited my answer accordingly. Let me know if it is working for you. – dc37 Mar 27 '20 at 17:06
  • Thank you. I edited my answer to specify that I need the start date to be based off of the 'Start' column rather than 'FY'. I expanded the reproducible data example as well. – Jessica Marie Mar 27 '20 at 17:41
  • Based on what you posted in your edited question, you misunderstood the role of pipes: you introduced some functions in the middle that broke the sequence and led to a NULL result. I edited my answer to provide a way to get what you are looking for. Let me know if it is working for you. – dc37 Mar 27 '20 at 17:54
  • Yes!! Thank you for putting up with my beginner knowledge :) – Jessica Marie Mar 27 '20 at 18:06