
I'm fairly new to R and have been struggling with simplifying some code using a for loop. I am attempting to pull water quality data from an online database using the package dataRetrieval. I currently have duplicated the code for each site and changed the site number and output name, but have been trying to simplify this by putting the script in a for loop and am having trouble with creating the separate data tables with unique identifiers.

Original code, which creates a data table for each site. The only things that change are siteNumbers and the data table name, "x"_dataTable.

#BW00A
siteNumbers = c("383652091125002")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW00A_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)
#BW01
siteNumbers = c("383648091124501")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW01_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)
#BW01A
siteNumbers = c("383648091124502")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW01A_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)

New Code I can't get to work. I've placed the siteNumbers and siteNames into a data frame. What I want is for the script inside the for loop to iterate through the siteNumbers to pull the data and then attribute the newly created data table to the corresponding siteNames aka unique_siteName. I'm not sure if this is even possible.

df <- data.frame(
  siteNumbers = c("383652091125001",    "383652091125002",  "383648091124501",  "383648091124502",  "383506091132201",  "383508091132002",  "383508091132004",  "383519091133701",  "383544091132601",  "383544091132502",  "383628091124801",  "383639091125902",  "383639091125901",  "383638091125001",  "383638091125002",  "383631091124803",  "383631091124804",  "383631091124801",  "383631091124802",  "383636091123801",  "383636091123811",  "383616091125701",  "383640091130701",  "383640091130702",  "383621091130701",  "383621091130703",  "383621091130702",  "383624091130501",  "383624091130502",  "383616091130801",  "383616091130802",  "383644091131601",  "383627091130201",  "383622091130604",  "383622091130605",  "383557091132001",  "383614091132801"),
  siteName = c("BW-00", "BW-00A",   "BW-01",    "BW-01A",   "MW-04",    "MW-04A",   "MW-04B",   "MW-11",    "BW-21",    "BW-21A",   "210TB-C6", "Bates Spring", "Bates Spring below dam",   "BW-02",    "BW-02A",   "BW-04A-D", "BW-04A-S", "BW-04D",   "BW-04S",   "BW-05",    "BW-05A",   "BW-07",    "BW-08",    "BW-08A",   "BW-11",    "BW-11A-D", "BW-11A-S", "BW-13",    "BW-13A",   "BW-14",    "BW-14A",   "BW4-15",   "BW4-16",   "BW4-17",   "BW4-18",   "W3",   "W4")
)

parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

for (row in df)
{
 unique_siteName <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)  
  
}

Thanks for your help!

Werner Hertzog
Cbuckley
  • In R it is generally a best practice to save related objects like dataframes in a list. You can assign names to each list object and/or reference the list object by its numerical index. E.g., `mylist[[3]]` refers to the 3rd object in the list `mylist`. Processing all related objects in a list is also facilitated with the lapply function. – SteveM Dec 09 '20 at 16:29

1 Answer

`for (row in df)` iterates over the columns of the data frame, not its rows. Loop over the row index instead, use that index to look up each site number, and accumulate the results in a list:

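results <- list()
# Loop over row indices; i selects the site number for this iteration
for (i in seq_len(nrow(df))) {
  results[[i]] <- readNWISqw(df$siteNumbers[i], parameterCode,
                             startDate, endDate)
}
# Label each list element with its site name
names(results) <- df$siteName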
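
As a quick offline illustration of this pattern, the sketch below swaps readNWISqw for a hypothetical stand-in, fetch_stub, so it runs without a network connection. fetch_stub and the two-row sites data frame are assumptions made for the example, not part of dataRetrieval.

```r
# Hypothetical stand-in for readNWISqw(): returns a tiny data frame
# instead of querying the online database.
fetch_stub <- function(siteNumber, parameterCode, startDate, endDate) {
  data.frame(site_no = siteNumber, parm_cd = parameterCode,
             stringsAsFactors = FALSE)
}

sites <- data.frame(
  siteNumbers = c("383652091125002", "383648091124501"),
  siteName = c("BW-00A", "BW-01"),
  stringsAsFactors = FALSE
)
parameterCode <- c("00010", "00095")
startDate <- "1900-01-01"
endDate <- "2020-12-01"

# Pre-allocate one list slot per site, then fill each slot by index.
results <- vector("list", nrow(sites))
for (i in seq_len(nrow(sites))) {
  results[[i]] <- fetch_stub(sites$siteNumbers[i], parameterCode,
                             startDate, endDate)
}
names(results) <- sites$siteName

# Each site's table is reachable by name instead of a separate variable.
bw01 <- results[["BW-01"]]
```

With the real function in place of the stub, results[["BW-01"]] would play the role of BW01_dataTable from the question.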

R also offers lapply as a way to simplify this common pattern. The above loop is equivalent to this:

results <- lapply(df$siteNumbers, FUN = readNWISqw, parameterCode, startDate, endDate)
names(results) <- df$siteName
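
To see how lapply passes the extra arguments along, here is a small sketch using a hypothetical fetch_stub in place of readNWISqw. The stub only echoes what it receives, so the forwarding is visible without contacting the database.

```r
# Hypothetical stand-in for readNWISqw(); it only echoes its arguments.
fetch_stub <- function(siteNumber, parameterCode, startDate, endDate) {
  paste(siteNumber, length(parameterCode), startDate, endDate)
}

siteNumbers <- c("383652091125002", "383648091124501")
siteName <- c("BW-00A", "BW-01")

# Arguments after FUN are forwarded unchanged to every call;
# only the first argument varies, taking each element of siteNumbers in turn.
out <- lapply(siteNumbers, FUN = fetch_stub,
              c("00010", "00095"), "1900-01-01", "2020-12-01")

# setNames() labels the list in the same step as returning it.
out <- setNames(out, siteName)
```

Because the site number is the first argument of the function, it is the one that lapply fills in from the vector; everything else stays fixed.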

I'd suggest reading my answer at How to make a list of data frames? for more discussion and explanation, both for why we do it this way and what good next steps are (for example, combining the results list into a single data frame).
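
One possible way to do that combining step in base R is sketched below. The small results list and its values are made up for illustration (real output from readNWISqw has many more columns), and the siteName column added here is an assumption about how you might want to tag rows.

```r
# Hypothetical named list of per-site data frames, standing in for `results`.
# The values are invented purely for illustration.
results <- list(
  "BW-00A" = data.frame(parm_cd = c("00010", "00095"), value = c(1, 2)),
  "BW-01"  = data.frame(parm_cd = "00010", value = 3)
)

# Tag each table with its site name so rows stay identifiable after stacking.
tagged <- Map(function(tbl, nm) {
  tbl$siteName <- nm
  tbl
}, results, names(results))

# Stack all tables row-wise into a single data frame.
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL
```

If you already use dplyr, bind_rows(results, .id = "siteName") should give a similar result in one call.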

Gregor Thomas