I have this dataset with 4 key features, and the aim is to plot them in a single plot so I can analyze them simultaneously, but I haven't been able to achieve this. Can someone please guide me?

Sample Data:

dput(head(Mafraq.spi, 30))
structure(list(year = c(1985, 1986, 1986, 1986, 1986, 1986, 1986, 
1986, 1986, 1986, 1986, 1986, 1986, 1987, 1987, 1987, 1987, 1987, 
1987, 1987, 1987, 1987, 1987, 1987, 1987, 1988, 1988, 1988, 1988, 
1988), `SPI 3` = c(1.03, 1.84, 0.06, -0.48, -0.05, 1.4, 1.4, 
0.74, 2.69, 1.79, 1.3, -0.59, -1.11, -1.14, -2.89, -1.4, -0.38, 
1.4, 1.4, 2.17, 0.67, 0.71, 0.03, -0.01, -0.02, 0.26, 0.69, 0.33, 
-0.38, 1.4), `SPI 6` = c(0.32, 0.24, 0.52, 0.91, 1.8, 0.05, -0.53, 
0.46, 2.68, 1.78, 1.29, 1.28, 0.87, 0.67, -1.53, -1.83, -1.33, 
-2.9, -1.4, 1.89, 0.66, 0.7, 0.46, 0.19, 0.38, 0.05, 0.21, -0.09, 
0.11, 0.68), `SPI 9` = c(0.32, 0.17, 0.15, 0.15, 0.21, 0.52, 
0.9, 1.87, 2.63, 1.76, 1.28, 1.27, 0.86, 0.7, 0.65, 0.65, 0.61, 
-1.54, -1.87, -0.32, -1.08, 0.38, 0.39, 0.19, 0.37, 0.41, 0.35, 
0.35, -0.02, 0.21), `SPI 12` = c(0.24, 0.14, 0.15, 0.15, 0.15, 
0.15, 0.15, 0.27, 1.91, 1.93, 1.87, 1.19, 0.75, 0.69, 0.64, 0.64, 
0.64, 0.64, 0.65, 0.94, -1.1, -0.62, -0.23, -0.54, 0.15, 0.35, 
0.34, 0.34, 0.34, 0.34)), row.names = c(NA, -30L), class = c("tbl_df", 
"tbl", "data.frame"), na.action = structure(1:11, names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"), class = "omit"))

Code:

library(dplyr)
library(readxl)  # read_excel
library(ggplot2)

Mafraq.spi <- na.omit(read_excel("K:/talha.R/Project 56/Mafraq-his-SPI-SPEI.xlsx", sheet = "SPI"))
selected_vars <- c("year", "SPI 3", "SPI 6", "SPI 9", "SPI 12")
Mafraq.spi <- Mafraq.spi %>%
  select(all_of(selected_vars)) %>%
  mutate(year = as.Date(paste0(year, "-01-01")))

ggplot(Mafraq.spi, aes(x = year)) +
  geom_line(aes(y = `SPI 12`, color = "SPI 12")) +
  geom_line(aes(y = `SPI 9`, color = "SPI 9")) +
  geom_line(aes(y = `SPI 6`, color = "SPI 6")) +
  geom_line(aes(y = `SPI 3`, color = "SPI 3")) +
  labs(x = "Year", y = "Value", color = "Variable") +
  scale_color_manual(values = c("SPI 12" = "blue", "SPI 9" = "red", "SPI 6" = "green", "SPI 3" = "purple")) +
  theme_minimal()

My plot: [image]

What I wanted: [image]

Fix with the help of @r2evans:

Mafraq.spi %>%
  mutate(year = year + (row_number() - 1)/12) %>%
  mutate(year = ifelse(year == 1985, 1985 + 11/12, year)) %>%
  mutate(year = as.numeric(year)) %>%
  pivot_longer(cols = starts_with("SPI"), names_to = "SPI", values_to = "value") %>%
  ggplot(aes(year, value, color = SPI)) +
  geom_line() +
  facet_grid(SPI ~ .) +
  scale_color_manual(values = c("SPI 12" = "blue", "SPI 9" = "red", "SPI 6" = "green", "SPI 3" = "purple")) +
  theme_minimal()

Plot: [image]

talha asif

1 Answer

  1. facet_grid breaks it into individual "swim lanes" or panels.
  2. Since two of the years in the sample have twelve observations each, I'm inferring that the rows are monthly, in which case we can adjust year to be fractional.
  3. This will be easier if we reshape the data from wide to long (see Reshaping data.frame from wide to long format, Transforming wide data to long format with multiple variables). For one, it means we can issue a single geom for all of the different SPI levels.

Adjusted code:

library(dplyr)
library(tidyr) # pivot_longer
library(ggplot2)
Mafraq.spi %>%
  mutate(year = year + (row_number() - 1)/12, .by = year) %>%
  # very special case with this sample data so that 1985 starts at
  # the last month of the year; this may only be useful for this sample
  # data, be careful with "real" data
  mutate(year = if_else(year == 1985, 1985 + 11/12, year)) %>%
  pivot_longer(-year, names_to = "SPI") %>%
  ggplot(aes(year, value, color = SPI)) +
  geom_line() +
  facet_grid(SPI ~ .) +
  scale_color_manual(values = c("SPI 12" = "blue", "SPI 9" = "red", "SPI 6" = "green", "SPI 3" = "purple")) +
  theme_minimal()

[image: ggplot]

Options:

  • if you don't want the legend, add scale_color_manual(guide="none", ...);
  • alternatively, since the lines are broken out into facets, you may not need different colors for each SPI; and
  • if you want the facets to say "6 months" instead of "SPI 6" then you can add mutate(SPI = paste(sub("SPI ", "", SPI), "months")) after the pivot and then update the names in your scale_color_manual values to match the new labels.
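
A minimal sketch combining the first and third options, assuming dplyr 1.1 or later (for `.by`). To keep it short it uses only the first 14 rows and two SPI columns of the question's sample data; the names `spi`, `spi_long`, and `p` are illustrative, not part of the original code.

```r
library(dplyr)
library(tidyr)   # pivot_longer
library(ggplot2)

# First 14 rows of the question's sample data, two SPI columns only
spi <- data.frame(
  year = c(1985, rep(1986, 12), 1987),
  `SPI 3` = c(1.03, 1.84, 0.06, -0.48, -0.05, 1.4, 1.4,
              0.74, 2.69, 1.79, 1.3, -0.59, -1.11, -1.14),
  `SPI 6` = c(0.32, 0.24, 0.52, 0.91, 1.8, 0.05, -0.53,
              0.46, 2.68, 1.78, 1.29, 1.28, 0.87, 0.67),
  check.names = FALSE  # keep the space in "SPI 3" / "SPI 6"
)

spi_long <- spi %>%
  # fractional year: month offset within each calendar year
  mutate(year = year + (row_number() - 1) / 12, .by = year) %>%
  # sample-data special case from the answer: 1985 has only its last month
  mutate(year = if_else(year == 1985, 1985 + 11/12, year)) %>%
  pivot_longer(-year, names_to = "SPI") %>%
  # "SPI 6" -> "6 months"
  mutate(SPI = paste(sub("SPI ", "", SPI), "months"))

p <- ggplot(spi_long, aes(year, value, color = SPI)) +
  geom_line() +
  facet_grid(SPI ~ .) +
  # color names must match the relabelled values; guide = "none" hides the legend
  scale_color_manual(guide = "none",
                     values = c("3 months" = "purple", "6 months" = "green")) +
  theme_minimal()

print(p)
```

With all four SPI columns, the relabelled facets sort alphabetically as text, so "12 months" would come before "3 months"; converting `SPI` to a factor with explicit levels is one way to control that order.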
r2evans
  • Thanks for the assistance but getting this error `Error in `mutate()`: ! Problem while computing `year = if_else(year == 1985, 1985 + 11/12, year)`. Caused by error in `if_else()`: ! `false` must have class `numeric`, not class `Date`.` – talha asif Jun 11 '23 at 09:33
  • That means your sample data is different from your real data. It's an easy fix: make sure `year` is numeric by doing `mutate(year = as.numeric(year))` earlier in the pipe. (You _could_ switch to `ifelse` instead, since it is class-unsafe and will do what we intend here.) (Odd, though ... this should have been done if you did `year + (row_number() - 1)/12`, so it suggests you're taking part of my code, not all of it. Is my assumption about `year` incorrect?) – r2evans Jun 11 '23 at 09:40
  • Yes, I have edited the question with a fix but got a different plot. The distribution of the data is not as smooth as yours. Please have a look @r2evans – talha asif Jun 11 '23 at 09:46
  • Your `year` appears to be not what I inferred it to be. Please clarify _exactly_ what `year` is intended to be. The fact that your x-axis now spans to 15000 says that my "month" assumption is not correct. – r2evans Jun 11 '23 at 10:04
  • The correct timeline of the original data is from 1985 to 2021. – talha asif Jun 11 '23 at 10:06
  • You need to use ALL of the code in my answer. If your version of dplyr is before 1.1, then you need to change `mutate(year = year + (row_number() - 1)/12, .by = year)` to `group_by(year) %>% mutate(year = year + (row_number() - 1)/12)`. You cannot take _part_ of an answer code, discard the rest, and assume that it will work as expected in the provided answer. – r2evans Jun 11 '23 at 10:09
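
The thread above can be illustrated with a small sketch. It assumes, as the question's first code block suggests, that `year` had already been converted to a `Date` before the answer's pipe ran; if so, `as.numeric()` yields days since 1970-01-01 rather than the calendar year, which would explain an x-axis in the thousands. Recovering the year with `format(year, "%Y")` is my suggestion, not something from the answer. The sketch also uses the `group_by()` form for dplyr versions before 1.1 and writes to a new column `time` (a hypothetical name) instead of overwriting the grouping column.

```r
library(dplyr)

# Stand-in for the real data: first 14 rows of the sample, with `year`
# already converted to Date as in the question's first code block
spi <- data.frame(
  year = as.Date(paste0(c(1985, rep(1986, 12), 1987), "-01-01")),
  `SPI 3` = c(1.03, 1.84, 0.06, -0.48, -0.05, 1.4, 1.4,
              0.74, 2.69, 1.79, 1.3, -0.59, -1.11, -1.14),
  check.names = FALSE
)

# as.numeric() on a Date is a day count since 1970-01-01, not a year
as.numeric(spi$year[1])
#> [1] 5479

spi_monthly <- spi %>%
  mutate(year = as.numeric(format(year, "%Y"))) %>%   # back to calendar year
  group_by(year) %>%                                  # instead of .by = year
  mutate(time = year + (row_number() - 1) / 12) %>%   # month offset per year
  ungroup()

head(spi_monthly$time, 4)
```

Plotting against `time` then keeps the x-axis in calendar years. The 1985 special case from the answer would still need to be applied for the sample data.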