Looping through a list of data frames in R

Question

I'm having trouble looping through a list of data frames. My full code is below, with file paths redacted, followed by the specifics.

# load packages ----
xlib <- c("rnoaa","devtools","dplyr","plyr","hydroTSM","chron","gdata","date", "rowr")
lapply(xlib, require, character.only = T)
rm(xlib)

# set token ----
# get this from NOAA site - https://www.ncdc.noaa.gov/cdo-web/token
options(noaakey = "QlzFUrMWVrLFKkZijFohYRmbtVvEaqUB") 

# set working folder ----
setwd("filepathway")

# read weather station information
statns <- read.csv(file = "filepathway/weather_station_locations_v1.csv", header = T)

# define dates ----
st.date <- as.Date("1950-01-01")
end.date <- as.Date("2018-04-30")
date1 <- data.frame(date = dip(st.date, end.date))
date1$year <- as.numeric(format(date1$date, "%Y"))
date1$month <- as.numeric(format(date1$date, "%m"))
date1$day <- as.numeric(format(date1$date, "%d"))
date1$jday <- as.numeric(format(date1$date, "%j"))

n1 <- 1
#n1 <- 100
for(id1 in (1:n1)) {

# pull data from NOAA server ----

# 1. precip
prcp.pull <- meteo_pull_monitors(monitors = statns$STATION[ id1 ],
                                 keep_flags = T,
                                 date_min = st.date,
                                 date_max = end.date,
                                 var = "PRCP")
prcp.pull$prcp <- prcp.pull$prcp / 10 # convert to mm/day

# 2. max. temperature
tmax.pull <- meteo_pull_monitors(monitors = statns$STATION[ id1 ],
                                 keep_flags = T,
                                 date_min = st.date,
                                 date_max = end.date,
                                 var = "TMAX")
tmax.pull$tmax <- tmax.pull$tmax / 10 # convert to dec. C

#3. min. temperature
tmin.pull <- meteo_pull_monitors(monitors = statns$STATION[ id1 ],
                                 keep_flags = T,
                                 date_min = st.date,
                                 date_max = end.date,
                                 var = "TMIN")
tmin.pull$tmin <- tmin.pull$tmin / 10 # convert to dec. C

statns2 <- split(statns, statns$"STATION")

colnames(prcp.pull)[1] <- "STATION"
colnames(tmin.pull)[1] <- "STATION"
colnames(tmax.pull)[1] <- "STATION"

prcpA <- rbind.fill(statns2, prcp.pull)
tminA <- rbind.fill(statns2, tmin.pull)
tmaxA <- rbind.fill(statns2, tmax.pull)
prcpB <- cbind.fill(statns2, prcpA)
tminB <- cbind.fill(statns2, tminA)
tmaxB <- cbind.fill(statns2, tmaxA)
tminC <- merge(tminB, statns2, by.x = 2, by.y = 2)
tmaxC <- merge(tmaxB, statns2, by.x = 2, by.y = 2)
prcpC <- merge(prcpB, statns2, by.x = 2, by.y = 2)
colnames(tminC)[2] <- "A"
colnames(tminC)[3] <- "B"
colnames(tminC)[4] <- "C"
colnames(tminC)[5] <- "D"
colnames(tminC)[6] <- "E"
colnames(tminC)[7] <- "G"
tminD = subset(tminC, select = -c(A, B, C, D, E, G ))
colnames(tmaxC)[2] <- "A"
colnames(tmaxC)[3] <- "B"
colnames(tmaxC)[4] <- "C"
colnames(tmaxC)[5] <- "D"
colnames(tmaxC)[6] <- "E"
colnames(tmaxC)[7] <- "G"
tmaxD = subset(tmaxC, select = -c(A, B, C, D, E, G ))
colnames(prcpC)[2] <- "A"
colnames(prcpC)[3] <- "B"
colnames(prcpC)[4] <- "C"
colnames(prcpC)[5] <- "D"
colnames(prcpC)[6] <- "E"
colnames(prcpC)[7] <- "G"
prcpD = subset(prcpC, select = -c(A, B, C, D, E, G ))

# save output as text file
fout <- paste("filepathway", c("prcpD", "tmaxD", "tminD"), statns$STATION[ id1 ],
              ".csv", sep = "")

# 1. precip
if (nrow(prcpD) > 0) {
  write.table(file = fout[1], x = prcpD, col.names = T, row.names = T, append = F, sep = ",", quote = F)
}

# 2. tmax
if (nrow(tmaxD) > 0) {
  write.table(file = fout[2], x = tmaxD, col.names = T, row.names = T, append = F, sep = ",", quote = F)
}

# 3. tmin
if (nrow(tminD) > 0) {
  write.table(file = fout[3], x = tminD, col.names = T, row.names = T, append = F, sep = ",", quote = F)
}
}

After the line `statns2 <- split(statns, statns$"STATION")` I get a list of data frames, and I would like the loop to run through each of these data frames individually. That is, when id1 (a number from 1 to 13926) matches FID + 1 (FID starts at 0 and runs to the end of the list), the commands after the split should run on that list element, one at a time, so that the precipitation data, temperature data, and weather station info stay matched.

Without the split into a list of data frames, each weather station gets only one row of data, whereas I'd like one row identifying the station and then one row for every date from start to end.
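
To make the goal concrete, here is a minimal sketch of one way to visit each element of the split list and attach the station info to every daily row. It is not a fix for the full script: `fake_pull()` is a hypothetical stand-in for `meteo_pull_monitors()` that returns made-up values so the code runs without a NOAA token, and the station table is a cut-down copy of the three stations shown in this question.

```r
# Toy station table, copied from the three stations shown in the question
statns <- data.frame(
  FID     = 0:2,
  STATION = c("CA003030525", "CA003030720", "CA003030768"),
  LAT     = c(49.8, 49.5667, 49.7333),
  LON     = c(-112.3, -113.05, -111.45),
  ELEV    = c(824, 980, 817),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for meteo_pull_monitors(): returns placeholder values
# so the sketch runs offline (these are NOT real observations)
fake_pull <- function(station_id, dates) {
  data.frame(id = station_id, date = dates,
             prcp = seq_along(dates),
             stringsAsFactors = FALSE)
}

dates <- seq(as.Date("1950-01-01"), as.Date("1950-01-05"), by = "day")

# One single-row data frame per station, named by station id
statns2 <- split(statns, statns$STATION)

# Visit each list element; `stn` is the one-row data frame for that station
prcp_by_station <- lapply(statns2, function(stn) {
  pulled <- fake_pull(stn$STATION, dates)
  names(pulled)[1] <- "STATION"
  # merge() repeats the station metadata on every daily row
  merge(stn, pulled, by = "STATION")
})

str(prcp_by_station[["CA003030525"]])
```

Each element of `prcp_by_station` then has one row per date, with the station columns repeated on every row. Whether that layout, or a single header row followed by data rows, is what the output files need is a separate decision.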

Update: I made a small subset of my list, ran head(statns.small), and pasted the results below.

$`CA003030525`
  FID     STATION  LAT    LON ELEV             NAME  CODE
1   0 CA003030525 49.8 -112.3  824 AB BARNWELL AGDM 71346

$CA003030720
  FID     STATION     LAT     LON ELEV                NAME  CODE
2   1 CA003030720 49.5667 -113.05  980 AB BLOOD TRIBE AGDM 71517

$CA003030768
  FID     STATION     LAT     LON ELEV          NAME  CODE
3   2 CA003030768 49.7333 -111.45  817 AB BOW ISLAND 71231
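
One caveat about matching id1 to FID + 1: `split()` orders the list by the sorted station ids, not by the row order of the original table, so list position and FID only line up if the table happens to be sorted by STATION. The sketch below uses a toy table with the rows deliberately shuffled (an assumption made for illustration) to show the mismatch, and then looks elements up by name instead of by position.

```r
# Toy table whose rows are deliberately NOT in alphabetical STATION order
statns <- data.frame(
  FID     = 0:2,
  STATION = c("CA003030768", "CA003030525", "CA003030720"),
  stringsAsFactors = FALSE
)

statns2 <- split(statns, statns$STATION)

# Does list position i correspond to FID + 1? Here it does not.
position_matches_fid <- vapply(
  seq_along(statns2),
  function(i) statns2[[i]]$FID + 1 == i,
  logical(1)
)
print(position_matches_fid)

# Looking up by station id avoids relying on list position
for (stn_id in statns$STATION) {
  stn <- statns2[[stn_id]]
  cat(stn_id, "has FID", stn$FID, "\n")
}
```

Indexing with `statns2[[stn_id]]` keeps the station table and the pulled data tied to the same id regardless of ordering.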

What's the problem? You should try to provide a small reproducible example and state the problem clearly. Have a look at https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example on how to ask a question. - MKR, Jun 04 '18 at 17:59
Consider generating `statns.small <- lapply(split(statns, statns$STATION), head)[1:3]` or something similar, and posting that literal frame here. That seems to be the crux of the problem you're encountering, but the steps to reproduce are somewhat arduous. - user295691, Jun 04 '18 at 18:04
Please do not put `rm(list = ls())` in your Stack Overflow example code (I edited it out). People trying to answer your question might not want to clear their entire workspace: if someone had just loaded a huge dataset or run a time-intensive function, accidentally running that could set them back a fair amount of time in their own work. - Jan Boyer, Jun 04 '18 at 18:09
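
For anyone wanting to follow the suggestions in these comments, a small reproducible object could be produced roughly as follows. This is only a sketch: the station table is rebuilt by hand from the three rows pasted in the question rather than read from the original CSV, and `dput()` is used so that the printed text can be pasted straight into a post.

```r
# Station table rebuilt by hand from the rows shown in the question
statns <- data.frame(
  FID     = 0:2,
  STATION = c("CA003030525", "CA003030720", "CA003030768"),
  LAT     = c(49.8, 49.5667, 49.7333),
  LON     = c(-112.3, -113.05, -111.45),
  ELEV    = c(824, 980, 817),
  NAME    = c("AB BARNWELL AGDM", "AB BLOOD TRIBE AGDM", "AB BOW ISLAND"),
  CODE    = c(71346, 71517, 71231),
  stringsAsFactors = FALSE
)

# Split by station and keep the first three list elements
statns.small <- lapply(split(statns, statns$STATION), head)[1:3]

# dput() prints R code that recreates the object exactly
dput(statns.small)
```

Pasting the `dput()` output lets others recreate `statns.small` without the CSV, the NOAA token, or any change to their own workspace.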
