0

I have previously used a function similar to this question/answer to read in multiple sheets and create a new id variable based on the sheet name:

Importing and Renaming multiple sheets in a excel worksbook with map_df and read_excel

file_name <- "example.xlsx"
sheets <- readxl::excel_sheets(file_name)
data <- map_df(sheets, ~mutate(read_excel(file_name, sheet = .x), 
                       id = .x))

However, I would now like to introduce additional steps to each sheet before joining sheets into a data frame. I would like to remove a row, transpose the sheet, and as the original column names in the excel files are times, create a new time variable. I can manage this with one sheet e.g.

time <-seq(ymd_hm('2023-04-20 12:00'),ymd_hm('2023-04- 11:50'), 
                 by = '10 mins')

file_name <- "example.xlsx"

data <- readxl::read_excel(file_name)%>%
  dplyr::slice(-3)%>%
  sjmisc::rotate_df(cn = T) %>%
  tibble()%>%
  mutate(time = time)) 

I have not been able to successfully combine the steps, i.e. import, transpose/format sheets, create id variable based on sheet name, and combine all sheets in a data frame. I would appreciate a tidyverse based solution/advise, and apologise for the lack of a reproducible example, as not sure how to do this when the question regards reading in data.

Pete
  • 600
  • 1
  • 6
  • 16

1 Answers1

1

Not sure what's exactly the issue. But you can simply move your data wrangling code inside a function which can be applied to the list of sheets similar to your first code:

Using some fake example data:

library(tidyverse)

file_name <- tempfile(".xlsx")

dat <- data.frame(
  time = 2010:2014,
  x = 1:5,
  y = rev(1:5)
)
writexl::write_xlsx(list(a = dat, b = dat), file_name)

sheets <- readxl::excel_sheets(file_name)

data <- map_df(sheets, function(sheet) {
  readxl::read_excel(file_name, sheet = sheet) %>%
    dplyr::slice(-3) %>%
    #sjmisc::rotate_df(cn = T) %>%
    tibble() %>%
    mutate(time = time, id = sheet)
})

data
#> # A tibble: 8 × 4
#>    time     x     y id   
#>   <dbl> <dbl> <dbl> <chr>
#> 1  2010     1     5 a    
#> 2  2011     2     4 a    
#> 3  2013     4     2 a    
#> 4  2014     5     1 a    
#> 5  2010     1     5 b    
#> 6  2011     2     4 b    
#> 7  2013     4     2 b    
#> 8  2014     5     1 b
stefan
  • 90,330
  • 6
  • 25
  • 51