Context: I have a dataset with 50+ features, and I would like to produce a boxplot, a histogram, and summary statistics for each of them for presentation purposes. That makes 150+ plots. The code I have used so far is:

library(ggplot2)
library(dplyr)
library(ggpubr)
library(ggthemes)
library(Rmisc)
library(gridExtra)    

myplots <- list()  # new empty list

for (i in seq(2,5,3)){
  local({
    i <- i
    p1 <- ggplot(data=dataset,aes(x=dataset[ ,i], colour=label))+ 
      geom_histogram(alpha=.01, position="identity",bins = 33, fill = "white") +
      xlab(colnames(dataset)[ i]) + scale_y_log10()  + theme_few()
    p2<- ggplot(data=dataset, aes( x=label, y=dataset[ ,i], colour=label)) +
      geom_boxplot()+ylab(colnames(dataset)[ i]) +theme_few()
    p3<- summary(dataset[ ,i])
    print(i)
    print(p1)
    print(p2)
    print(p3)
    myplots[[i]] <<- p1  # histogram
    myplots[[i+1]] <<- p2 # boxplot
    myplots[[i+2]] <<- p3 # summary
  })
}

myplots[[2]]
length(myplots)

n <- length(myplots)
nCol <- floor(sqrt(n))
do.call("grid.arrange", c(myplots, ncol=nCol)) # PROBLEM: can't draw the summary because it is not a grob

I have created a list in which every three elements hold the histogram, boxplot, and summary for one feature. I iterate over the 50+ features, appending each result to the list (not the best way to do this, I know). I then run into the following error when I try to draw the list with grid.arrange:

Error in gList(list(grobs = list(list(x = 0.5, y = 0.5, width = 1, height = 1,  : 
  only 'grobs' allowed in "gList"

Understandably so, as the summary function does not produce a graphical object. Any ideas as to how I can overcome this setback apart from not including summary statistics at all?

Prradep
Bharat Desai
  • How large is `dataset`? Can you paste a minimal sample of `dataset`? – hpesoj626 Jun 01 '18 at 11:13
  • @BharatDesai, Have you seen this [post](https://stackoverflow.com/questions/34838870/grid-arrange-from-gridextras-exiting-with-only-grobs-allowed-in-glist-afte/34839064)? – mnm Jun 01 '18 at 12:14
  • Or [skimr](https://github.com/ropenscilabs/skimr)? That gives summary stats and histograms in a nice format (but no boxplots). Also, you can make a table into a grob with `gridExtra::tableGrob`. – aosmith Jun 01 '18 at 13:15
  • Hi @Ashish, unfortunately my issue is a little different. The result of summary() is itself not a grob in the first place that i can append to a groblist as recommended by the answers from that post. – Bharat Desai Jun 04 '18 at 01:35

1 Answer

After combining several of the suggestions here, I managed to plot the summary statistics for each feature as a grob by looping over the columns of my dataset.

library(skimr)
library(gridExtra)
library(ggplot2)
library(dplyr)
mysumplots <- list() # new empty list

for (i in seq(2,ncol(dataset))){
  local({
    i <- i  # capture the current loop index inside local()
    sampletable <- data.frame(skim((dataset[ ,i]))) # creates a skim data frame
    # select the relevant columns (relies on skim output having `stat` and `formatted` columns)
    summarystats<-select(sampletable, stat, formatted)
    summarystats<-slice(summarystats , 4:10) # select the relevant stats
    p3<-tableGrob(summarystats, rows=NULL) # converts the data frame into a tableGrob

    mysumplots[[i]] <<- p3 # appends the grob to the list of summary table grobs
  })
}

do.call("grid.arrange", c(mysumplots, ncol=3)) # use grid arrange to plot all my grobs

This creates a skim data frame for each column (feature), selects the relevant statistics, converts them to a tableGrob assigned to p3, and appends that grob to a list of table grobs, one per feature. I then used grid.arrange to draw all of the tableGrobs.
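The same idea can be extended to put the histogram, boxplot, and summary table side by side, which is what the question originally asked for. The following is only a minimal sketch under some assumptions: the built-in `iris` data stands in for `dataset`, `Species` plays the role of `label`, base `summary()` is used instead of skimr, and `summary_grob` and `feature_panels` are hypothetical helper names introduced for illustration.

```r
library(ggplot2)
library(gridExtra)

# Assumption: iris stands in for `dataset`, Species for `label`.
dataset <- iris
label_col <- "Species"
features <- setdiff(names(dataset), label_col)

# Hypothetical helper: turn summary() of one numeric column into a table grob.
summary_grob <- function(x) {
  s <- summary(x)
  stats <- data.frame(stat = names(s), value = signif(as.numeric(s), 4))
  tableGrob(stats, rows = NULL)
}

# Hypothetical helper: histogram, boxplot, and summary table for one feature.
# Using a function (rather than a bare for loop) gives each call its own
# environment, so `f` is not overwritten before the plots are built.
feature_panels <- function(f) {
  hist_plot <- ggplot(dataset, aes(x = .data[[f]], colour = .data[[label_col]])) +
    geom_histogram(bins = 33, fill = "white", position = "identity") +
    labs(x = f, colour = label_col)
  box_plot <- ggplot(dataset, aes(x = .data[[label_col]], y = .data[[f]],
                                  colour = .data[[label_col]])) +
    geom_boxplot() +
    labs(x = label_col, y = f, colour = label_col)
  list(hist_plot, box_plot, summary_grob(dataset[[f]]))
}

# Flat list of panels: three consecutive entries per feature.
panels <- do.call(c, lapply(features, feature_panels))

# One row per feature: histogram, boxplot, summary table.
combined <- arrangeGrob(grobs = panels, ncol = 3)
grid::grid.newpage()
grid::grid.draw(combined)
```

Because every element of `panels` is either a ggplot object or a grob, `arrangeGrob` accepts the whole list. With 50+ features a single page will be very crowded, so `gridExtra::marrangeGrob`, which splits a list of grobs across several pages, may be worth a look.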

Bharat Desai