-1

I'm currently using R studio Version 1.0.143 with R version 3.4.4.

I'm trying to make a apply() function that does the following but to many files:

df1 <- read_excel("Sample 1.xlsx")
df1["Sample"] <- "Sample 1"
df2 <- read_excel("Sample 2.xlsx")
df2["Sample"] <- "Sample 2"
rbind(df1, df2) -> final

A function that adds a column to the dataframe informing to which sample the variables belong, because after I will bind all samples. Not only this function has to read many excel files, but to add a column in each file with the name of the file (as the example). I couldn't find anything similar in other questions

I tried to use lapply, but without success. I need this because I want to merge more than 100 dataframes (I could do by hand but it seems dumb to not use apply). All the files have the same columns, so there is no need for a function that puts NAs when a column is not present in one file.

  • 4
    Reading many files into R is one of the most asked questions on R tagged questions of StackOverflow. – Parfait Jul 05 '18 at 18:14

1 Answers1

1
names <- c("Sample 1", "Sample 2")

out_l <- lapply(names, function(x) {
  tmp_df <- read_excel(paste0(x, ".xlsx"))
  tmp_df[["Sample"]] <- x
  tmp_df
})
out_df <- do.call(rbind.data.frame, out_l)

Or using the shorter version per @Parfait's comment

names <- c("Sample 1", "Sample 2")

out_l <- lapply(names, function(x) {
  transform(read_excel(paste0(x, ".xlsx")), Sample = x)
})
out_df <- do.call(rbind.data.frame, out_l)
MHammer
  • 1,274
  • 7
  • 12
  • 2
    Or use `transform` or `within` for a one liner inside `lapply`: `transform(read_excel(paste0(x, ".xlsx")), Sample = x)` – Parfait Jul 05 '18 at 18:15
  • Indeed. I was trying to stick to a similar format as the OP so as to optimize on familiar ground. – MHammer Jul 05 '18 at 18:16
  • As an fyi, `names` is a vector, not a data frame. – MHammer Jul 05 '18 at 18:40
  • I tried something a little different so I don't have to create the "names" dataframe. `dta_files <- list.files('C:/Users, pattern = 'Sample', full.names = F) out_l <- lapply(dta_files, function(x) { transform(read_excel(paste0(x)), Sample = x) }) out_df <- do.call(rbind.data.frame, out_l)` This gives me a column with each filename (Sample 1.xlxs), which seems good enough – Gabriel A. C. Gama Jul 05 '18 at 18:44
  • Yes, `list.files()` is another useful and common approach. – MHammer Jul 05 '18 at 18:47