
I have a folder containing 151 files. Within a loop, I want to read in each file, apply a function to it, and then combine the values returned for each file into a single CSV with column headings, using R in RStudio.

I am currently using a loop within a loop. The first loop reads in all of my files, whose paths are stored in the object FilesToCalc. This seems to be working, with the head()/tail() functions giving me the first and last files.

For my second loop, I want to apply a function I have created, Massofgas(), to each of my files. This function carries out calculations on rows and finishes by fitting a linear model across the whole dataset in the file. The function returns some metadata (e.g. site name) and test statistics such as the root mean squared error.

When I run this code, it seems to run indefinitely and only provides the return for one file. I would like to amend this so that the return for each file becomes a row in a single output file. I would also like to name each column for interpretation later.

for(i in nFiles) {

  Jan22 <- read.csv(FilesToCalc[i], sep = ",", skip = 0) 

head(Jan22)
tail(Jan22)

  n <- NULL
  Collvol <- Jan22$CollarHeight
  for(j in nFiles) {
    CH4CO2Mass <- Massofgas(collvol)
    
    Fluxes<-rbind(CH4CO2Mass)
?rbind()
    
    filename <- paste0(OutPutFiles,"//", "Fluxes.csv")
    
    write.csv(Fluxes, file = filename, col.names = c("Site", "GLA", "Airtemp", "Soiltemp", "Soilmoist1", "Soilmoist2", "meanmoist", "SoilEc1", "SoilEC2", "MeansoilEC", "L1", "L2", "L3", "L4", "L5", "L6", "Dipwell", "Start_Time", "meanCO2", "meanCH4", "CO2_Slope", "CO2_intercept", "CH4_Slope", "CH4_intercept", "R2_CO2", "R2_CH4", "RMSE_CO2", "RMSE_CH4", "P_CO2", "P_CH4"))
  }
}

Solution

To solve this problem, I changed the function so that it builds its outputs into a data frame and returns that data frame. I also named the columns of the data frame within the function.

To run the function on each of the files, I first created an empty data frame, then used rbind() inside the loop to append one row per file to it. The combined data frame is written to the output file once, after the loop.

#### Set directories ####
FilesToCalc <- list.files("#...Directory..../FileToProcess", pattern = "csv", full.names = TRUE)

#### Create a function ####
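Function_name <- function(Files_to_Read) {
  # function here, e.g. fit a linear model to the data from one file
  lmfile <- lm(y ~ x, data = Files_to_Read)

  # collect the results in a one-row data frame with named columns
  DF1 <- data.frame(linearmodel = anova(lmfile)$"Pr(>F)"[1], x1, y1, z1....)
  return(DF1)
}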

#### Loop the function ####

Output <- data.frame()
for (i in seq_along(FilesToCalc)) {
  # read one file, run the function on it, and append the returned row
  Files_to_Read <- read.csv(FilesToCalc[i], sep = ",", skip = 0)
  Results <- Function_name(Files_to_Read)
  Output <- rbind(Output, Results)
}

# write the combined results once, after the loop (data first, then the path)
write.csv(Output, file = "#...Directory...#")
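
For anyone who wants to try the pattern end to end, here is a minimal self-contained sketch. It is not my real code: the function `summarise_file()`, the columns `Time` and `CO2`, the made-up data and the temporary file names are all assumptions for illustration. It also skips zero-byte files, since empty files were causing errors for me later on.

```r
# Hypothetical stand-in for the real calculation function: fits a linear
# model and returns a one-row data frame with named columns.
summarise_file <- function(dat, site) {
  fit <- lm(CO2 ~ Time, data = dat)
  data.frame(
    Site      = site,
    CO2_Slope = unname(coef(fit)["Time"]),
    R2_CO2    = summary(fit)$r.squared,
    RMSE_CO2  = sqrt(mean(residuals(fit)^2))
  )
}

# Build a temporary folder holding two made-up data files and one empty file.
in_dir <- file.path(tempdir(), "FilesToProcess_demo")
dir.create(in_dir, showWarnings = FALSE)
set.seed(1)
for (site in c("A", "B")) {
  demo <- data.frame(Time = 1:10, CO2 = 400 + 2 * (1:10) + rnorm(10))
  write.csv(demo, file.path(in_dir, paste0(site, ".csv")), row.names = FALSE)
}
file.create(file.path(in_dir, "empty.csv"))

FilesToCalc <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)

# Start from an empty data frame and append one row per readable file.
Output <- data.frame()
for (i in seq_along(FilesToCalc)) {
  # Skip zero-byte files, which read.csv() cannot parse.
  if (file.size(FilesToCalc[i]) == 0) next
  dat  <- read.csv(FilesToCalc[i])
  site <- tools::file_path_sans_ext(basename(FilesToCalc[i]))
  Output <- rbind(Output, summarise_file(dat, site))
}

# Write the combined table once, outside the loop and outside the input folder.
out_file <- file.path(tempdir(), "Fluxes_demo.csv")
write.csv(Output, out_file, row.names = FALSE)
```

Because the column names are set inside the function, the CSV gets its headings automatically and nothing has to be passed to write.csv() for them. Using seq_along(FilesToCalc) rather than a hard-coded count also means the loop still covers every file if the number of files in the folder changes.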

Katros
  • Can you make your example [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – jrcalabrese Feb 15 '23 at 18:33
  • @jrcalabrese sorted - When I was creating the function rather than the vector output I originally had I created an object with a data frame of my outputs and then returned this data frame. I could then shorten my loop and use rbind() to add each row. `DF1<-data.frame(x,y,z) return(DF1) Fluxes <- data.frame() for(i in 1:149) { Jan22 <- read.csv(FilesToCalc[i], sep = ",", skip = 0) CH4CO2Mass <- Massofgas(Jan22) Fluxes<-rbind(Fluxes,CH4CO2Mass) } Fluxes` Some files I was reading were also empty causing errors later – Katros Feb 15 '23 at 19:04
  • Instead of commenting, can you edit/update your post so your original question is reproducible for other StackOverflow users? – jrcalabrese Feb 15 '23 at 19:11
