
I am trying to import many CSV files from an EPA website. The file names follow a consistent pattern. How can I use a loop to automate importing the CSV files and naming the resulting data frames in R?

Right now I'm doing it manually by swapping out the month name in each line of code as illustrated below:


library(tidyverse)

#Download 2013 data

jan_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_jan2013.csv")%>%
  add_column("month"="jan","year"=2013)
feb_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_feb2013.csv")%>%
  add_column("month"="feb","year"=2013)
mar_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_mar2013.csv")%>%
  add_column("month"="mar","year"=2013)
apr_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_apr2013.csv")%>%
  add_column("month"="apr","year"=2013)
may_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_may2013.csv")%>%
  add_column("month"="may","year"=2013)
jun_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_june2013.csv")%>%
  add_column("month"="jun","year"=2013)
jul_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_july2013.csv")%>%
  add_column("month"="jul","year"=2013)
aug_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_aug2013.csv")%>%
  add_column("month"="aug","year"=2013)
sep_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_sept2013.csv")%>%
  add_column("month"="sep","year"=2013)
oct_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_oct2013.csv")%>%
  add_column("month"="oct","year"=2013)
nov_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_nov2013.csv")%>%
  add_column("month"="nov","year"=2013)
dec_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_dec2013.csv")%>%
  add_column("month"="dec","year"=2013)

I'd like to set something up where all 12 months are imported, the added column is modified appropriately and the resulting df is named appropriately, by month.

Thanks for the help!

ocwa80
  • See https://stackoverflow.com/a/24376207/3358227 for "list of tables". Namely, `res <- lapply(setNames(URLs, basename(URLs)), read.csv)` will get you a list of tables; from there you can (and likely should) keep them in a list and work on them using `lapply`, since what you do to one you will likely do to all. – r2evans Sep 21 '22 at 20:35
  • An alternative, once you have read them all in, is to use `dplyr::bind_rows(res, .id = "filename")`; you can then do all of your work grouped by filename and/or add your `year` and `month` columns based on the filename (simple patterns should work). – r2evans Sep 21 '22 at 20:38
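
The filename-based idea from the comments can be sketched roughly as below. This is only an illustration in base R: `parse_rin_name()` is a hypothetical helper (not part of any package), and it assumes every file name looks like `rindata_<month><year>.csv`, as the 2013 files in the question do. It shortens spellings such as "june" and "sept" to three-letter labels.

```r
# Hypothetical helper: derive month and year from file names such as
# "rindata_sept2013.csv". Assumes the pattern rindata_<month><year>.csv.
parse_rin_name <- function(filename) {
  # Drop any directory or URL prefix and the .csv extension
  stem <- sub("\\.csv$", "", basename(filename))
  # Capture the month letters and the four-digit year
  parts <- regmatches(stem, regexec("^rindata_([a-z]+)([0-9]{4})$", stem))
  data.frame(
    filename = filename,
    # Keep the first three letters so "june" -> "jun" and "sept" -> "sep"
    month = substr(vapply(parts, `[`, character(1), 2), 1, 3),
    year = as.integer(vapply(parts, `[`, character(1), 3)),
    stringsAsFactors = FALSE
  )
}
```

Calling `parse_rin_name(c("rindata_jan2013.csv", "rindata_sept2013.csv"))` returns one row per file; names that do not match the assumed pattern come back with `NA` for `month` and `year`. The result could then be joined to the combined table on `filename`.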

1 Answer


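Read all of the CSVs by pasting each month into the URL, name the resulting list by month, then enframe, add a year column, and unnest. Note that this returns one combined data frame with `month` and `year` columns rather than twelve separately named data frames:

library(tidyverse)

# Month spellings as they appear in the file names (note "june", "july", "sept")
months <- c("jan", "feb", "mar", "apr", "may", "june", "july", "aug", "sept", "oct", "nov", "dec")

dfs <- lapply(months, function(x) read.csv(paste0("https://www.epa.gov/sites/default/files/2017-10/rindata_", x, "2013.csv"))) %>%
  # Use three-letter labels ("jun", "jul", "sep") to match the question's month column
  setNames(substr(months, 1, 3)) %>%
  enframe(name = "month") %>%
  add_column(year = 2013) %>%
  unnest(value)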

Let me know if this works!
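
If separate data frames per month are still wanted, one option is a named list keyed like the question's object names (`jan_13`, `feb_13`, ...). The following is a minimal base R sketch, not a tested download script: `build_urls()` and `read_rin_year()` are hypothetical helpers, the `reader` argument exists only so the function can be tried without network access, and only the 2013 URLs are confirmed by the question (other years may live under a different path).

```r
base_url <- "https://www.epa.gov/sites/default/files/2017-10/rindata_"
# Month spellings as used in the file names
file_months <- c("jan", "feb", "mar", "apr", "may", "june",
                 "july", "aug", "sept", "oct", "nov", "dec")

# Hypothetical helper: build the 12 URLs, named like "jan_13"
build_urls <- function(year) {
  labels <- substr(file_months, 1, 3)
  setNames(
    paste0(base_url, file_months, year, ".csv"),
    paste0(labels, "_", sprintf("%02d", year %% 100))
  )
}

# Hypothetical helper: read each file and tag it with month and year.
# `reader` defaults to read.csv; pass a stub to try it offline.
read_rin_year <- function(year, reader = read.csv) {
  urls <- build_urls(year)
  labels <- substr(file_months, 1, 3)
  # Map() keeps the names of `urls`, so the list is keyed "jan_13", ...
  Map(function(url, label) {
    df <- reader(url)
    df$month <- label
    df$year <- year
    df
  }, urls, labels)
}
```

With this, `rin_2013 <- read_rin_year(2013)` would download the twelve files and `rin_2013$jan_13` would be the January table. Keeping the tables in a list follows the suggestion in the comments; `list2env(rin_2013, envir = globalenv())` would create the individual objects if they are really needed.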

dcsuka