
I have the following data (these are the first 10 rows of a dataset that covers 43 days, with 24 hours of observations per day):

structure(list(Agency.Station.ID = c("MI270N003.6D", "MI270N004.7D", 
"MI270N005.7D", "MI270N007.3D", "MI270N008.5D", "MI270N003.6D", 
"MI270N004.7D", "MI270N005.7D", "MI270N007.3D", "MI270N008.5D"
), Date = structure(c(16526, 16526, 16526, 16526, 16526, 16526, 
16526, 16526, 16526, 16526), class = "Date"), Time = c("00:00:00", 
"00:00:00", "00:00:00", "00:00:00", "00:00:00", "00:05:00", "00:05:00", 
"00:05:00", "00:05:00", "00:05:00"), Speed = c(59, 34, 46, 61, 
46, 58, 39, 51, 36, 52), Precipitation = c(0, 0, 0, 0, 0, 0, 
0, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame")

I want to make a pdf file for each "Agency.Station.ID". Each page in the pdf represents a "Date". Each page should plot "Time vs. Speed and Precipitation". Is that possible?

mustafa
  • Seems like you want to save all plots into one pdf with each page = one plot? [Seems there's a good answer posted on this question](https://stackoverflow.com/questions/20500706/saving-multiple-ggplots-from-ls-into-one-and-separate-files-in-r). – chemdork123 May 12 '20 at 03:06
  • @chemdork123. I have 5 (Agency.Station.Id). each one will be a separate pdf. the pages on each pdf represent a Date (I have 43 days). on each page there is (Time VS Speed and Precipitation) plot. – mustafa May 12 '20 at 04:50
  • @chemdork123, that means I will have 43 pages on each pdf. – mustafa May 12 '20 at 04:51

1 Answer


Without your full dataset, it's not possible to show a complete working solution. With that being said... here's a complete working solution with a dummy dataset. First, the dataset:

library(lubridate)
library(ggplot2)
library(tidyr)
library(dplyr)
set.seed(1234)
df1 <- data.frame(
  Agency.Station.ID=rep('Station1',30),
  Date=c(rep(as.Date('2020-01-01'),10), rep(as.Date('2020-02-01'),10), rep(as.Date('2020-03-01'),10)),
  Time=rep(1:10, 3),
  Speed=c(rnorm(10,40,2), rnorm(10,45,2), rnorm(10,56,1)),
  Precipitation=rep(runif(10,0,30), 3)
)

df2 <- data.frame(
  Agency.Station.ID=rep('Station2',30),
  Date=c(rep(as.Date('2020-01-02'),10), rep(as.Date('2020-03-01'),10), rep(as.Date('2020-04-01'),10)),
  Time=rep(1:10, 3),
  Speed=c(rnorm(10,45,11), rnorm(10,35,7), rnorm(10,53,3)),
  Precipitation=rep(runif(10,0,30), 3)
)

df <- rbind(df1, df2)

It's a bunch of nonsense, but it will work for our purposes. I have two stations: "Station1" and "Station2". Each station here has different dates, with one overlapping. I'm not sure if your dataset has all stations with the same dates, but this code should work in either case.

The basic idea is to use the method shown in this answer to make the pdf files. As written, it is set up to create one *.pdf file for each value of df$Agency.Station.ID, with one page per plot and one plot for each value of df$Date. I'm plotting the values of df$Speed and df$Precipitation as line geoms vs. df$Time.

Prepare your dataset

First of all, you should adjust your data to be able to create the intended plot (as you indicate in your question). This means that you will need to gather() df$Precipitation and df$Speed into two columns: one for the "Aspect" (either "Precipitation" or "Speed"), and one for the actual "Value". We will also want to ensure df$Agency.Station.ID and df$Date are converted to factors, since we will be stepping through their levels() to create the plots and pdfs:

df <- df %>% gather(Aspect, Value, -Agency.Station.ID, -Date, -Time)
df$Agency.Station.ID <- factor(df$Agency.Station.ID)
df$Date <- factor(df$Date)
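
As a side note, newer versions of tidyr document gather() as superseded by pivot_longer(). A minimal sketch of the same reshaping, assuming a small made-up data frame (`wide`, not part of the original answer) with the same column names:

```r
library(tidyr)

# Small stand-in for the dummy data above; the values are made up
wide <- data.frame(
  Agency.Station.ID = "Station1",
  Date = as.Date("2020-01-01"),
  Time = 1:3,
  Speed = c(40, 42, 41),
  Precipitation = c(0, 5, 10)
)

# Stack Speed and Precipitation into an Aspect/Value pair of columns,
# which is what gather(Aspect, Value, -Agency.Station.ID, -Date, -Time) does
long <- pivot_longer(
  wide,
  cols = c(Speed, Precipitation),
  names_to = "Aspect",
  values_to = "Value"
)
```

The row order differs from what gather() produces, but that should not matter for the plots, since ggplot2 groups the lines by Aspect.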

The Function

For the function to create the plots, the plan is to have two nested for loops. The outer loop steps through each level in df$Agency.Station.ID and creates a separate pdf file for each. The inner loop steps through each level in df$Date and creates a plot for each:

for(i in levels(df$Agency.Station.ID)) {
  pdf(paste0(i,'.pdf'))

  for(j in levels(df$Date)) {
    subset.data <- df[which(df$Agency.Station.ID==i & df$Date==j),]
    if (nrow(subset.data)!=0) {
      p <- ggplot(data=subset.data, aes(Time, Value)) +
        geom_line(aes(color=Aspect)) +
        labs(title=paste('Station:',i, " Date:",j)) +
        theme_bw()
      invisible(print(p))
    }
  }
  dev.off()
}

You will note that I'm using an if statement to check whether the combination of station and date actually has any rows; if not, no plot is drawn. This is how I'm able to print one page per date whether the stations share the same dates or not. Also note the explicit print(): inside a for loop a ggplot object is not drawn automatically, so print() is what sends it to the pdf() graphics device. I've wrapped it in invisible() to keep anything from showing up in the console, although print() on a ggplot should already return its value invisibly, so the wrapper is likely optional.

If you want additional usability, you can specify width= and height= (in inches) in the pdf() call to set the page size. You can also, of course, adjust the aesthetics of the plots however you wish. Example of one of the plots below:

[Image: example plot of Speed and Precipitation vs. Time for one station and date]
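
A rough sketch of controlling the page size: pdf() accepts width and height arguments in inches. The file name and the toy data here are placeholders, not taken from the question:

```r
library(ggplot2)

# Hypothetical output file; written to a temp directory for this example
out_file <- file.path(tempdir(), "Station1_example.pdf")

# width and height are given in inches and apply to every page of the file
pdf(out_file, width = 11, height = 6)

# Toy data standing in for one station/date subset
toy <- data.frame(Time = 1:10, Value = (1:10)^2)
p <- ggplot(toy, aes(Time, Value)) +
  geom_line() +
  theme_bw()

print(p)   # each print() call adds one page to the open pdf device
dev.off()  # close the device so the file is written out
```

In the nested loop above, the same arguments would go into the pdf(paste0(i,'.pdf')) call.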

chemdork123
  • When I ran the function on my data, I got 5 pdf files with 43 pages each. That is perfect. @chemdork123 However, the pages do not show the plot I want and the x-axis is a mess (I have 24 hours divided into 5-minute intervals, which means 288 observations). Also, when running the code, I get the message below: (geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?) – mustafa May 12 '20 at 12:37
  • I tried to solve the problem by adding (Group=1) to the aesthetics. It plotted the graphs but they are so messy. – mustafa May 12 '20 at 13:20
  • sounds like your x-axis variable is not being formatted as a time variable and has "date" along with it – chemdork123 May 12 '20 at 13:30
  • how can I share the figure I am getting with you? that might help more. – mustafa May 13 '20 at 08:01
  • seems to me that the problems you are seeing with the axis are best suited to be a separate question on SO - where I'd be happy to help again, of course. If the above answer solves the question of saving into pdfs/pages as you specified, you can accept it, but then pose your plot question separately with the code you used to generate it, representative piece of your data (use `dput(your.data.frame)` to paste the dataset as text in the question), and also an image of your current plot. – chemdork123 May 13 '20 at 14:09
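
For what it's worth, the axis problem described in these comments usually comes from Time being a character column: ggplot2 then treats each time as its own discrete group. A minimal sketch of one possible fix, assuming the Date and Time columns from the question are combined into a single date-time column (the `DateTime` name, the UTC time zone, and the 2-hour breaks are my own choices, not from the thread):

```r
library(ggplot2)

# Two rows for station MI270N003.6D, copied from the question's sample
one_station <- data.frame(
  Date  = as.Date(c(16526, 16526), origin = "1970-01-01"),
  Time  = c("00:00:00", "00:05:00"),
  Speed = c(59, 58)
)

# Combine Date and the "HH:MM:SS" string into a POSIXct column,
# so the x-axis is continuous rather than one category per time stamp
one_station$DateTime <- as.POSIXct(
  paste(one_station$Date, one_station$Time),
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

# With a full day of data, breaks every 2 hours keep the axis readable
p <- ggplot(one_station, aes(DateTime, Speed)) +
  geom_line() +
  scale_x_datetime(date_breaks = "2 hours", date_labels = "%H:%M") +
  theme_bw()
```

With a continuous x-axis the group=1 workaround should no longer be needed.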