14

I have created dozens of data frames in R and would like to append them all to one sheet in an Excel file.

Here are two of the pages I have looked at in an attempt to find an answer (I don't have 10 reputations so I can't paste all four webpage urls I have visited):

Write data to Excel file using R package xlsx The author says: "You can also add the dataframes to a particular starting place in the sheet using the startRow and startCol arguments to the addDataFrame function." Here is the suggested code:

workbook.sheets workbook.test addDataFrame(x = sample.dataframe, sheet = workbook.test,
   row.names = FALSE, startColumn = 4) # write data to sheet starting on line 1, column 4
saveWorkbook(workbook.sheets, "test.excelfile.xlsx") # and of course you need to save it.

Based on this suggestion, this was my attempt in RStudio:

addDataFrame(df_fl1, sheet = "AllData2.xlsx", startRow = 712)

This was R's output: Error in sheet$getWorkbook : $ operator is invalid for atomic vectors

I've also tried this page:

Tutorial on Reading and Importing Excel Files into R "If, however, you want to write the data frame to a file that already exists, you can execute the following command:"

write.xlsx(df, 
           "<name and extension of your existing file>", 
           sheetName="Data Frame"
           append=TRUE)
write.xlsx(df_fl3, "AllData2.xlsx", sheetName="Salinity1", append=TRUE)

I tried this code and it overwrote the data that was already in the sheet. How can I append data from the data frames into an Excel sheet?

eipi10
  • 91,525
  • 24
  • 209
  • 285
Regulus
  • 147
  • 1
  • 1
  • 6

2 Answers2

15

Appending to an existing Excel worksheet is a bit of a pain. Instead, read all of your Excel data files into R, combine them within R, and then write the single combined data frame to a new Excel file (or write to a csv file if you don't need the data to be in an Excel workbook). See code below for both the easy way and the hard way.

Easy Way: Do all the work in R and save a single combined data frame at the end

For example, if all of your Excel data files are in the current working directory and the first worksheet in each Excel file contains the data, you could do the following:

library(xlsx)

# Get file names
file.names = list.files(pattern="xlsx$")

# Read them into a list
df.list = lapply(file.names, read.xlsx, sheetIndex=1, header=TRUE)

Then combine them into a single data frame and write to disk:

df = do.call(rbind, df.list)

write.xlsx(df, "combinedData.xlsx", sheetName="data", row.names=FALSE)

Hard Way: Append successive data frames to a pre-existing Excel worksheet

Create a list of data frames that we want to write to Excel (as discussed above, in your actual use case, you'll read your data files into a list in R). We'll use the built-in iris data frame for illustration here:

df.list = split(iris, iris$Species)

To write each data frame to a single Excel worksheet, first, create an Excel workbook and the worksheet where we want to write the data:

wb = createWorkbook()
sheet = createSheet(wb, "data")

# Add the first data frame
addDataFrame(df.list[[1]], sheet=sheet, row.names=FALSE, startRow=1)

Now append all of the remaining data frames using a loop. Increment startRow each time so that the next data frame is written in the correct location.

startRow = nrow(df.list[[1]]) + 2    

for (i in 2:length(df.list)) {

  addDataFrame(df.list[[i]], sheet=sheet, row.names=FALSE, col.names=FALSE, 
               startRow=startRow)

  startRow = startRow + nrow(df.list[[i]])

  }

Save the workbook:

saveWorkbook(wb, "combinedData.xlsx")

addDataFrame is useful if you want to layout various summary tables in various parts of an Excel worksheet and make it all look nice for presentation. However, if you're just combining raw data into a single data file, I think it's a lot easier to do all the work in R and then just write the combined data frame to an Excel worksheet (or csv file) at the end.

eipi10
  • 91,525
  • 24
  • 209
  • 285
  • 1
    Easy Way: This code appears to be doing something similar to what I did to create the dataframes from the text files: file.names = list.files(pattern="xlsx$") # Read them into a list df.list = lapply(file.names, read.xlsx, sheetIndex=1, header=TRUE) The data text files are in several different folders, so I'd rather use my dataframes instead of moving all those files to one folder. And I am in the working directory with the dataframes, but R does not recognize them: "Error in is.data.frame(x) : object 'dataframe.RData' not found" Hard Way: It worked, but not with my dataframes. – Regulus Jan 07 '16 at 19:51
  • 1
    Hard Way: I have tried df2.list = split("dataframe.RData", "dataframe2.RData") then created the sheet and used addDataFrame() and saveWorkbook. But that only creates a spreadsheet with the dataframe name in the upper left corner of the sheet. I have also tried that with the data's flat files instead of the dataframes. I will also try the addWorkSheet code on this page: [link]http://www.inside-r.org/packages/cran/xlconnect/docs/appendWorksheet[/link] – Regulus Jan 07 '16 at 20:16
  • 1
    For the "easy way": From any given directory, you can get all the files in all subdirectories, including the full path by doing, for example, `list.files(pattern="xlsx", full.names=TRUE, recursive=TRUE)`. For the hard way, the `split` function was just to create the example list of data frames for illustration. With your real data, you would read the files in the same way whether you do it the easy way or the hard way. – eipi10 Jan 07 '16 at 20:52
  • 1
    Also, your code for the split function doesn't seem correct. The first argument should be a data frame (i.e., already loaded into R as an object in your global environment) and the second argument should be a categorical column from that data frame. – eipi10 Jan 07 '16 at 20:53
  • 1
    @eipi10 In your hard-way approach you used a for-loop to add each data frame. Is there a method in which I can replace the for loop with lapply? – Ali Nov 17 '19 at 16:13
1

To get around the original error that you mentioned:

Error in sheet$getWorkbook : $ operator is invalid for atomic vectors

You can try this:

wb <- loadWorkbook("<name and extension of your existing file>")
addDataFrame(df,getSheets(wb)$<sheetname>, startRow = 712)
saveWorkbook(wb, <filename>)