How do I create a time series plot showing how two variables have changed over time, with each line coloured by its region?

I have two regions, England and Wales, and for each I have calculated total_tax and total_income.

I want to plot these on a ggplot over the years, using the years variable.

How would I do this and colour the regions separately?

I have the year variable, which I will put on the x axis. I then want to plot both incometax and taxpaid on the same graph to show how they have changed over time.

How would I add a third axis so that the plot shows how these two variables have changed over time?

I have tried this code, but it does not work the way I want it to:

ggplot(tax_data, filter %>% aes(x=date)) +
  geom_line(aes(y=incometax, color=region)) +
  geom_line(aes(y=taxpaid, color=region))+
Rui Barradas
josh
    `ggplot` deliberately makes it difficult to have a secondary `y` axis because it is generally used poorly. You can see [this FAQ for some workarounds](https://stackoverflow.com/q/3099219/903061). – Gregor Thomas Apr 24 '20 at 15:00

1 Answer

ggplot is a bit hard to grasp at first. I guess you're trying to achieve something like the following.

Assuming your data has one column each for date, incometax and taxpaid, here is an example dataset:

library(tidyverse)

dataset <- tibble(date = seq(from = as.Date("2015-01-01"), to = as.Date("2019-12-31"), by = "month"),
                  incometax = rnorm(60, 100, 10),
                  taxpaid = rnorm(60, 60, 5))

To plot one line each for incometax and taxpaid, we need to reshape, or "tidy", the data (see here for details):

dataset <- dataset %>% pivot_longer(cols = c(incometax, taxpaid))

The data now has three columns. The former column names have become the values of the name column, and the measurements are in the value column:

# A tibble: 6 x 3
  date       name      value
  <date>     <chr>     <dbl>
1 2015-01-01 incometax 106. 
2 2015-01-01 taxpaid    56.9
3 2015-02-01 incometax 112. 
4 2015-02-01 taxpaid    65.0
5 2015-03-01 incometax  95.8
6 2015-03-01 taxpaid    64.6

The data is now in the right format for ggplot, and you can map name to the colour of the lines:

ggplot(dataset, aes(x = date, y = value, colour = name)) + geom_line()
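The question also asks for the lines to be coloured by region, which the example above does not cover. Here is a minimal sketch of one way to do it, assuming the real data has a region column next to year, incometax and taxpaid. The tax_data values below are made up for illustration, and the names tax_long and variable are my own choices, not taken from the question:

```r
library(tidyverse)

# Hypothetical example data: one row per year and region (values are made up).
tax_data <- tibble(
  year      = rep(2015:2019, times = 2),
  region    = rep(c("England", "Wales"), each = 5),
  incometax = c(100, 104, 109, 111, 115, 60, 62, 63, 66, 68),
  taxpaid   = c(60, 63, 65, 66, 70, 35, 36, 38, 39, 41)
)

# Reshape so that each row holds one measurement; year and region are kept as they are.
tax_long <- tax_data %>%
  pivot_longer(cols = c(incometax, taxpaid),
               names_to = "variable", values_to = "value")

# Colour identifies the region, line type identifies the variable,
# which gives four lines in total.
p <- ggplot(tax_long,
            aes(x = year, y = value, colour = region, linetype = variable)) +
  geom_line()
p
```

If the two variables are on very different scales, adding `facet_wrap(~ variable, scales = "free_y")` to the plot gives each variable its own panel and y axis, which is usually easier to read than a secondary axis.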
Wolfgang Arnold