0

I have a number of excel files (.xlsx) that store athlete VO2max testing data that I want to analyse as a group. Each file represents 1 client and I can get all the files into a list in R with the following code:

files <- list.files(path = "C:/Users/.../vo2 tests", pattern = "*.xlsx", full.names = T)

I then want to be able to find the maximum value from a specific column of a specific sheet in the excel files (this is the clients VO2max). So far the following code works to find the value I need but only returns a result for 1 of the files, however I want the function to run and collect from each of the excel files and output into a new data frame:

tbl <- lapply(files, read_excel, sheet = "Averages", skip = 2)

max(tbl[[1]][["ml/Kg/min"]])

Once I have all of the VO2max values into a new data frame I want to take the average & standard deviation of this new data set for reporting. The aim being I can provide my clients with a constantly updating ranking system of their performance compared to other athletes. Any help here would be much appreciated.

Nick
  • 3
  • 1
  • 2
    [Opening all files in a folder, and applying a function](https://stackoverflow.com/questions/9564489/opening-all-files-in-a-folder-and-applying-a-function); `lapply(, )` – Henrik Jul 13 '20 at 10:03
  • [Read all files in directory and apply multiple functions to each data frame](https://stackoverflow.com/questions/14958516/read-all-files-in-directory-and-apply-multiple-functions-to-each-data-frame) – Henrik Jul 13 '20 at 10:16

1 Answers1

0

After reading the file, you can take max of the column in each file.

all_maxs <- sapply(tbl, function(x) max(x[["ml/Kg/min"]], na.rm = TRUE))

You can then take mean or sd of all_maxs.

Ronak Shah
  • 377,200
  • 20
  • 156
  • 213