-1

I have some code which uses a loop to calculate a water balance for catchments (watersheds) for 8 catchments. I would like the loop to write it's output to a dataframe in R but the only way I can work out how to do this is to write it to csv (inside the loop), then outside of the loop read each of the csv files separately. I feel as though there could be a better way to do this - any ideas?

This is my code: (note it is part of a shiny app, hence my desire to avoid reading and writing csv files)

WB_catchments <- function (){  
      for (i in 1:8){
        file_name <- gsub(" ", "", paste("outputs\\", Lake_name[i], "_catchment_water_balance.csv"))
        p <- pts()[[i]]
        
        Rain_in_WB <- RAIN() %>% filter(Grid_id %in% p)
        Rain_in_WB$Grid_id <- NULL  #remove Grid_id column
        Rain_in_WB <- colSums(Rain_in_WB, na.rm = TRUE)  # sum over catchment
        
        AET_out_WB <- AET() %>% filter(Grid_id %in% p)
        AET_out_WB$Grid_id <- NULL #remove Grid_id column
        AET_out_WB <- -1*colSums(AET_out_WB, na.rm = TRUE) # sum over catchment and multiply by -1 as is an output
        
        Evap_WB <- -1*EVAP_lakes[i,]
        
        SW_in_WB <- SW_in_C[i,]
        GW_in_WB <- GW_in_C[i,]
        
        SW_out_WB <- -1*SW_out_C[i,]
        GW_out_WB <- -1*GW_out_C[i,]
        
        stor_WB <- STOR[i,]
        
        out_catchment <- -1*outside[i,]
        
        bal <- as.data.frame(cbind(WY, Rain_in_WB, SW_in_WB, GW_in_WB, AET_out_WB, Evap_WB, SW_out_WB, GW_out_WB, stor_WB, out_catchment))
        bal <- mutate(bal, "res" = rowSums(bal[,2:10], na.rm = TRUE))     
        colnames(bal) <- c("WaterYear", "Rain", "SW_in", "GW_in", "AET", "Evap", "SW_out", "GW_out", "Storage", "Water_out_of_Greater_Tarawera_Catchments", "Residual")
        
        
        
        write.csv(bal, file_name)
      }
    }

WB_catchments()
    Okareka_WB_C <- read.csv("outputs\\Okareka_catchment_water_balance.csv")
    Okaro_WB_C <- read.csv("outputs\\Okaro_catchment_water_balance.csv")
    Okataina_WB_C <- read.csv("outputs\\Okataina_catchment_water_balance.csv")
    Rerewhakaaitu_WB_C <- read.csv("outputs\\Rerewhakaaitu_catchment_water_balance.csv")
    Rotokakahi_WB_C <- read.csv("outputs\\Rotokakahi_catchment_water_balance.csv")
    Rotomahana_WB_C <- read.csv("outputs\\Rotomahana_catchment_water_balance.csv")
    Tarawera_WB_C <- read.csv("outputs\\Tarawera_catchment_water_balance.csv")
    Tikitapu_WB_C <- read.csv("outputs\\Tikitapu_catchment_water_balance.csv")
Nicki_NZ
  • 123
  • 8
  • 3
    Please provide us with a small, reproducible code snippet that we can copy and paste to better understand the issue and test possible solutions. You can share datasets with `dput(YOUR_DATASET)` or smaller samples with `dput(head(YOUR_DATASET))`. (See [this answer](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#5963610) for detailed instructions.) – ktiu Jun 08 '21 at 00:00
  • It's not clear to me when you are calling this function. Is this inside your shiny application? Rather than wring to separate CVS files you can write to a `rds` file with `save()` which will make future loading in R faster. Or if this in your shiny app, your function can just return a list of data.frames directly rather than writing back to disk at all. It's just not very clear exactly what you need to do. – MrFlick Jun 08 '21 at 00:10

2 Answers2

1

Instead of posting some very special code snippets, it is in most cases to post a toy example. Here an artificial example how to fill a data frame in a loop. As R is a vectorized language, it is often to avoid a loop at all. Compare the two cases below:

## number of cases
N <- 10

### looped version =====
df <- data.frame(
  rain=rep(0, N),
  evap=rep(0, N)
)

for (i in 1:N) {
  # instead of runif, do your calculations
  # ...
  rain <- runif(1, min=0, max=10)
  evap <- runif(1, min=1, max=5)
  df[i, ] <- c(rain, evap)
}

df

### vectorized version =====
rain <- runif(N, min=0, max=10)
evap <- runif(N, min=1, max=5)

df2 <- data.frame(
  rain=rain,
  evap=evap
)

df2

If your calculations return more than one row in each iteration and you don't know beforehand how many, grow the data frame like this:

## empty data frame
df3 <- data.frame(
  rain=NULL,
  evap=NULL
)

for (i in 1:N) {
  # instead of runif, do your calculations
  # ...
  rain <- runif(7, min=0, max=10)
  evap <- runif(7, min=1, max=5)
  df3 <- rbind(df3, cbind(rain, evap))
}

df3

Edit: Create several data frames (as elements of a list)

If separate data frames are needed, it is a good idea to put them together in a list. INstead of a loop,l we can use lapply:

create_df <- function(i) {
  # optionally: do something with i, e.g. select file name
  rain <- runif(7, min=0, max=10)
  evap <- runif(7, min=1, max=5)
  df <- data.frame(
    rain=rain,
    evap=evap
  )
}

## lapply does the "loop" and returns a list of data frames
df_list <- lapply(1:8, create_df)

df_list[[7]] # returns 7th data frame
tpetzoldt
  • 5,338
  • 2
  • 12
  • 29
0

Another way I got this working was by using assign(file_name, bal, envir = .GlobalEnv) instead of write.csv(bal, file_name) in the last line of my function

Nicki_NZ
  • 123
  • 8