Plotting two data frames with different Y scales in the same line graph in R

Question

I have two graphs with the same X axis (Date) but different Y axes:

df1: crime rate occurrence for 5 types of crime across 9 regions in England

structure(list(
  Date = c("2019-04", "2019-04", "2019-12", "2020-02", 
           "2019-09", "2019-10", "2020-05", "2020-07", "2019-07", "2019-05"), 
  Region = structure(c(7L, 1L, 3L, 7L, 7L, 7L, 3L, 7L, 1L, 7L), 
 .Label = c("South East", "South West", "London", "East of England", "East Midlands", 
            "West Midlands", "Yorkshire and The Humber", "North East", "North West"), class = "factor"), 
  Crime = c("Robbery", "Robbery", "Robbery", "Robbery", "Anti-social behaviour", 
            "Anti-social behaviour", "Anti-social behaviour", "Anti-social behaviour", 
            "Robbery", "Anti-social behaviour")), 
 row.names = c(NA, -10L), class = c("data.table", "data.frame"), index = integer(0))

df2: unemployment rate fluctuation across the same 9 regions (same time period too)

structure(list(
  Date = c("2020-02", "2019-12", "2019-10", "2020-10", "2020-06", "2019-11", "2019-07", "2020-05", 
           "2020-08", "2020-06"), 
  Region = structure(c(1L, 8L, 10L, 8L, 9L, 8L, 9L, 10L, 8L, 3L), 
  .Label = c("England", "South East", "South West", "London", "East of England", "East Midlands", 
             "West Midlands", "Yorkshire and The Humber", "North East", "North West"), 
  class = "factor"), 
  Unemployment.rate = c(4.04317280091498, 4.47398990035041, 3.99786361805527, 5.15177120913334, 
                      5.16820059074221, 4.34062792253313, 4.97071907922267, 3.79490967669574, 
                      4.16298001615593, 3.57267916967994)), 
  row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

Desired output:

Overlay each region's df2 line on the df1 line graph for the same region, with the df2 Y-axis scale on the right side of the graph.

df1 line graph:

df2 line graph:

The first graph in df2 shows the unemployment rate across England as a whole; it does not need to be plotted on the df1 line graph.

Any help would be greatly appreciated, thanks in advance!

Do you want [this](https://stackoverflow.com/a/39805869/8245406)? As for removing the England line, maybe `data = subset(df2, Region != "England")`. — Rui Barradas, Jan 07 '21 at 15:51
Also that haha I need to figure out how to overlay the graphs of df2 in their respective regions of df1 first :) From what I've seen what you linked only helps with adding a second Y axis scale (which I really appreciate btw!) — Lactuca, Jan 07 '21 at 15:54
As posted, your `df1` does **not** have the crime rates column, only the types of crime. — Rui Barradas, Jan 07 '21 at 16:03
I used the following code to count the types of crime and plot them: `crimedata %>% count(Region, Month, Crime, name = 'Crime_occurrencies') %>% mutate(Date = as.Date(paste0(Month, '-01'))) %>%` — Lactuca, Jan 07 '21 at 16:06
You can pipe that to `full_join(df2 %>% filter(Region != "England"))` and then to `ggplot`. Your data, as posted, still doesn't allow for a graph to be plotted. Can you post datasets that have Region, Date in common? — Rui Barradas, Jan 07 '21 at 16:52
Unfortunately I can't merge the two data frames because the Crime rate dataset has about 8 million rows and the Unemployment one only 260. That's because in the crime dataset the same month is repeated for the frequency of each type of crime committed and for each region. In the unemployment data set, the month is repeated once for each region (as it represents the unemployment % for that region, in that month). So even though both datasets share the same time period and regions, the number of rows with the same month differs greatly — Lactuca, Jan 07 '21 at 17:04
Don't you aggregate the crime rate data before plotting? If so, it won't have so many rows and maybe you can merge them. Another option is to pass the second data set through the layer's `data` argument, as in `geom_line(data = df2, ...)`. — Rui Barradas, Jan 07 '21 at 17:11
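A minimal sketch of the aggregate-then-join idea from the comment above, assuming dplyr is available. The two small data frames are hypothetical stand-ins with invented values; only the column names (`Month`, `Region`, `Crime`, `Date`, `Unemployment.rate`, `Crime_occurrencies`) follow the thread. Once the crime records are counted per region, month and crime type, the table is small enough to join with the unemployment data.

```r
library(dplyr)

# Hypothetical stand-in for the raw crime data: one row per recorded crime
crimedata <- data.frame(
  Month  = c("2019-04", "2019-04", "2019-04", "2019-05", "2019-05"),
  Region = c("London", "London", "South East", "London", "South East"),
  Crime  = c("Robbery", "Robbery", "Robbery", "Robbery", "Robbery")
)

# Hypothetical stand-in for df2: one row per region and month (invented values)
unemployment <- data.frame(
  Date   = c("2019-04", "2019-05", "2019-04", "2019-05", "2019-04"),
  Region = c("London", "London", "South East", "South East", "England"),
  Unemployment.rate = c(4.5, 4.6, 3.1, 3.2, 4.0)
)

# Collapse the crime records to one count per region, month and crime type
crime_counts <- crimedata %>%
  count(Region, Month, Crime, name = "Crime_occurrencies") %>%
  mutate(Date = as.Date(paste0(Month, "-01")))

# Drop the national series and give Date the same class as in crime_counts
unemp <- unemployment %>%
  filter(Region != "England") %>%
  mutate(Date = as.Date(paste0(Date, "-01")))

# Each crime-count row picks up the unemployment rate of its region and month
merged <- left_join(crime_counts, unemp, by = c("Region", "Date"))
```

The join keys must have matching types, which is why both `Date` columns are converted with `as.Date()` before joining.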
I tried the geom_line(data = df2) but it gives me the following error " Aesthetics must be either length 1 or the same as the data (990): x ". I don't understand why since I am using the same x axis for both df1 and df2 `ggplot() + geom_line(mapping = aes(x = crimedata$Date, y = Crime_occurrencies), stat = "identity", color = crimedata$Region, group = crimedata$Region) + geom_line(mapping = aes(x = crimedata$Date, y = Unemployment_data$Unemployment.rate)` — Lactuca, Jan 07 '21 at 17:16
In `ggplot` don't use `crimedata$` or `Unemployment_data$` in `aes()`, use it only as the data argument. — Rui Barradas, Jan 07 '21 at 19:17
If I don't it says "Error in FUN(X[[i]], ...) : object 'Unemployment.rate' not found" — Lactuca, Jan 07 '21 at 19:49
Because you are not using the data argument: `ggplot(data = crimedata, aes(etc))`. Do **not** start plots with an empty `ggplot`. — Rui Barradas, Jan 07 '21 at 20:25
I ended up merging the two datasets, but I get the same issue even if I specify " `data=` " in ggplot() `merged.data %>% count(Region, Month, Crime, name = 'Crime_occurrencies') %>% mutate(Date = as.Date(paste0(Month, '-01'))) %>% ggplot(data = merged.data, aes(Date, Crime_occurencies, color = Region)) + geom_line()+ geom_line(mapping = aes(x = Date, y = Unemployment.rate))` — Lactuca, Jan 07 '21 at 20:32
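One possible way to get the desired plot, sketched under assumptions: ggplot2 is installed, the crime data has already been aggregated, and both data frames share `Date` and `Region`. The data below is invented for illustration, and the scaling factor `k` is a hypothetical choice (ratio of the two maxima), not something from the thread. The second data frame is passed through the layer's `data` argument, columns are named bare inside `aes()`, and `sec_axis()` relabels the right-hand axis in unemployment units.

```r
library(ggplot2)

# Hypothetical aggregated crime data: one row per region, month and crime type
crime_counts <- data.frame(
  Date   = as.Date(rep(c("2019-04-01", "2019-05-01", "2019-06-01"), times = 4)),
  Region = rep(c("London", "South East"), each = 6),
  Crime  = rep(rep(c("Robbery", "Anti-social behaviour"), each = 3), times = 2),
  Crime_occurrencies = c(120, 135, 128, 300, 320, 310, 40, 45, 38, 150, 160, 155)
)

# Hypothetical unemployment data without the England series
unemp <- data.frame(
  Date   = as.Date(rep(c("2019-04-01", "2019-05-01", "2019-06-01"), times = 2)),
  Region = rep(c("London", "South East"), each = 3),
  Unemployment.rate = c(4.5, 4.6, 4.4, 3.1, 3.2, 3.0)
)

# Assumed scaling factor that maps the unemployment rate onto the crime-count scale
k <- max(crime_counts$Crime_occurrencies) / max(unemp$Unemployment.rate)

p <- ggplot(crime_counts, aes(Date, Crime_occurrencies, colour = Crime)) +
  geom_line() +
  # Second data set goes in `data`; no `df$column` inside aes()
  geom_line(
    data = unemp,
    aes(Date, Unemployment.rate * k),
    colour = "black", linetype = "dashed", inherit.aes = FALSE
  ) +
  # Left axis shows counts; right axis undoes the scaling to show percentages
  scale_y_continuous(
    name = "Crime occurrences",
    sec.axis = sec_axis(~ . / k, name = "Unemployment rate (%)")
  ) +
  # One panel per region, so each unemployment line lands on its own region
  facet_wrap(~ Region)

print(p)
```

A secondary axis in ggplot2 is only a relabelling of the primary one, so the unemployment values have to be multiplied by `k` for drawing and divided by `k` again in `sec_axis()`.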

