4

I often write papers with correlation matrices. I would like to be able to export the correlation matrix to Excel in xls or xlsx format. I would also like to format in bold correlations that meet a threshold (e.g., > .2). I was thinking maybe XLConnect might provide the functionality.

In order to make the example simple, assume the data frame is as follows, and assume I want to bold all cells greater than 5.

x <- data.frame(matrix(1:9, nrow = 3))
# > x
#   X1 X2 X3
# 1  1  4  7
# 2  2  5  8
# 3  3  6  9

As an aside, I note that solutions have been proposed for cellbolding for markdown:

I've also found this answer, but it is not a very general solution in that it takes quite a bit to adapt it to the general task of taking a data frame and a formatting rule:

export data frames to Excel via xlsx with conditional formatting

Community
  • 1
  • 1
Jeromy Anglim
  • 33,939
  • 30
  • 115
  • 173
  • 2
    Please show any attempts you have made. I barely work with spreadsheets, and I solved this years ago. The [xlsx package](https://cloud.r-project.org/web/packages/xlsx/index.html) has (had?) examples doing just that. – Dirk Eddelbuettel Aug 01 '16 at 01:21
  • @DirkEddelbuettel I haven't really started yet. But if I come up with a solution, I'll post an answer as I often do. My starting assumption was there is no easy answer when you google the question, and it's a common use case, so it would be good if such a resource existed. – Jeromy Anglim Aug 01 '16 at 01:29
  • 1
    The resource exist, and is documented within the [xlsx](https://cloud.r-project.org/web/packages/xlsx/index.html) package. I suggest you close this question. It only adds noise and zero signal. – Dirk Eddelbuettel Aug 01 '16 at 01:30
  • 2
    @DirkEddelbuettel I disagree. The purpose of StackOverflow is to create resources on the internet that speed up the process of coding. If you google a clear question, and the answer does not come up, then the resource is not that accessible. If the solution is documented, then quote the solution. I appreciate the tip on xlsx, but it is not trivial to go from that documentation to a solution. There are also many excel packages for R. So that add anotehr barrier between use case and solution. I'll work through it and post an answer – Jeromy Anglim Aug 01 '16 at 01:39
  • 1
    Please explain how adding dozens (hundred, thousands, ...) of content-free posts makes searching easier. You. Are. Making. It. Worse. End of thread for me, there is no gain here as you are surely convinced of your merry way. Let's agree to disagree here. – Dirk Eddelbuettel Aug 01 '16 at 01:41
  • http://www.rdocumentation.org/packages/xlsx/versions/0.5.7/topics/CellStyle – Dirk Eddelbuettel Aug 01 '16 at 01:43
  • @dirk Thanks, that looks like a good starting point. – Jeromy Anglim Aug 01 '16 at 01:49
  • You can adapt the linked answer like this, no? `sheetname <- "mysheet"; write.xlsx(x, "mydata.xlsx", sheetName=sheetname); file <- "mydata.xlsx"; wb <- loadWorkbook(file); cs <- CellStyle(wb) + Font(wb, isBold = TRUE); sheets <- getSheets(wb); sheet <- sheets[[sheetname]]; rows <- getRows(sheet, rowIndex=2L:(nrow(x) + 1L)); cells <- getCells(rows, colIndex = 2L:(ncol(x) + 1L)); lapply(which(t(x) > 5), function(ii) setCellStyle(cells[[ii]], cs)); saveWorkbook(wb, file)`. I'm sure you can do it without creating .xlsx, reading it into R, applying formatting, and resaving. – Jota Aug 01 '16 at 02:38
  • @jota Thanks I've adapted your answer. I hope you don't mind. Thanks heaps. – Jeromy Anglim Aug 01 '16 at 03:00

1 Answers1

4

I created the following function that was adapted from @jota's answer here

xlsx_boldcells <- function(x, matches, file = "test.xlsx", sheetname = "sheet1") {
    # x data.frame or matrix
    # matches: logical data.frame or matrix of the same size indicating which cells to bold
    # copy data frame to work book and load workbook
    require(xlsx)
    write.xlsx(x, file, sheetName=sheetname)
    wb <- loadWorkbook(file)              

    # specify conditional formatting
    # Note: this could be modified to apply different formatting
    # see ?CellStyle
    fo <- Font(wb, isBold = TRUE)  
    cs <- CellStyle(wb, font=fo)  

    # Get cell references
    sheets <- getSheets(wb)               # get all sheets
    sheet <- sheets[[sheetname]]          # get specific sheet
    rows <- getRows(sheet, rowIndex=2:(nrow(x)+1))  # get rows
    cells <- getCells(rows, colIndex = 2:(ncol(x)+1))  

    # Matches to indexes
    indm <- data.frame(which(matches, arr.ind = TRUE, useNames = FALSE)) 
    names(indm) <- c("row", "col")
    # +1 required because row and column names occupy first rows and columns
    indm$index <- paste(indm$row + 1, indm$col + 1, sep = ".")

    # apply cell style
    lapply(indm$index, function(ii) setCellStyle(cells[[ii]],cs))

    # save workbook
    saveWorkbook(wb, file)
}

Thus, it can be applied to proposed problem:

 xlsx_boldcells(x, x > 5)

yielding:

excel sheet with bold

Or it could be applied to the common correlation problem (i.e., bolding large correlations, e.g., greater than .6) as follows:

data(mtcars)
cors <- round(cor(mtcars), 2)
xlsx_boldcells(cors, abs(cors) > .6 & cors!=1, file = "cormatrix.xlsx")

bold cell formatted correlations using xlsx

Community
  • 1
  • 1
Jeromy Anglim
  • 33,939
  • 30
  • 115
  • 173