4

In my Shiny data quality assessment application, users upload datasets and can generate reports using RMarkdown. One of those reports is a visual summary of the dataset, rating each variable of the dataset on three metrics: % Compliance (how well the data match the data model), % Logic (is death date after birth date?), and % Completeness. Each patient in the dataset belongs to a GROUP, and the metrics are reported by GROUP. Each dataset includes multiple data tables, and so a separate grid of heatmaps (on a separate page) is created for each data table. Some users have lots of GROUPs in their data, while others have only a few. Similarly, the number of variables in a table vary as well.

How can I standardize the look of these faceted heatmaps across these different circumstances? I am looking for a way to specify the height and width in cm for each heatmap cell, but haven't found a good solution. The two plots below illustrate the problem.
1. Faceted Heatmap with small number of patient groups in dataset (cells are too big!): Small number of groups resulting in huge cells

  1. Faceted Heatmap with large number of patient groups in dataset: Large number of groups and variables resulting in small cells

Using coord_fixed() doesn't correct the issue nor does coord_equal(). It still results in large cells for the first scenario and even tinier ones for the second.

Below is a reproducible example to generate fake data and the resulting faceted heatmaps for a variety of numbers of variables and patient groups. (e.g., makeWrappedHeatmaps(numGroups = 20, numVars = 5) will create faceted set of 20 heatmaps summarizing metrics for 5 variables)

library(tidyverse)
library(data.table)

createFakeVariableMetrics <- function(varName, numGroups){
  tibble(GROUP = paste("Group", LETTERS[1:numGroups]), 
         Variable = rep(varName, numGroups),
         Compliant = round(runif(numGroups, 20, 100), 0), 
         Logic = round(runif(numGroups, 0, 100), 0), 
         Complete = round(runif(numGroups, 25, 100), 0) 
         )
}

# example possible variable names
variableNames <- c("PATIENT", "BIRTH_D", "MED_CODE", "DROP_Y", "DX_D", "VIS_D", "DX_CODE", "RS_CODE", "LAB_V")

makeWrappedHeatmaps <- function(numGroups, numVars){
  if (numVars > length(variableNames)) return("Specify a number of variables less than 10")
  varNames <- variableNames[1:numVars]
  dfList <- lapply(varNames, createFakeVariableMetrics, numGroups)
  metricdf <- rbindlist(dfList)
  forHeatmap <- gather(metricdf, c("Compliant", "Logic", "Complete"), key = "Metric", value = "percent")

  ggplot(data = forHeatmap, aes(x = Metric, y = Variable)) +
    geom_tile(aes(fill = percent), colour = "white", size = 0.01) +
    facet_wrap("GROUP") +
    labs(x="",y="") +
    geom_text(aes(label = percent), size = 3) +
    scale_y_discrete(expand=c(0,0))+
    scale_x_discrete(expand=c(0,0), position = "top") +
    theme(
      axis.text.x = element_text(size = 8),
      plot.background = element_blank(),
      axis.ticks.y = element_blank(),
      axis.ticks.x = element_blank(),
      panel.border = element_blank(),
      legend.position = "none")
}

Thank you!

astrofunkswag
  • 2,608
  • 12
  • 25
Judy
  • 102
  • 1
  • 5
  • Have you checked out the ```ComplexHeatmap``` package? Go through the package reference https://jokergoo.github.io/ComplexHeatmap-reference/book/other-tricks.html, it addresses the exact issue you're facing along with additional details such as how to set a specific size for all heat maps irrespective of size of dataset. The package is also highly customisable when compared to creating heat maps in ```ggplot2```. – Anurag N. Sharma May 05 '20 at 06:56
  • Some questions: how do you set your device / viewport size in shiny?You could set the plot dimensions relative to its size. Or create a dynamic device size relative to the number of plotted dimensions. Would resulting white space be OK? – tjebo May 05 '20 at 07:37
  • @Tjebo, the heatmaps are not viewed interactively but are part of a report generated using downloadHandler/RMarkdown... The reports can be saved as PDF or HTML files. – Judy May 05 '20 at 19:59
  • I guess this means you've got flexible device size? In other words, the figure dimensions can be different, depending on how many tiles you have? Or do you always need the same dimensions, say, e.g. 4x5 inch, and you would accept a bit of whitespace if the tiles are only few? – tjebo May 06 '20 at 08:12

0 Answers0