I have a list, colName, that contains the column names of a data frame, isNumValLong. Each element of the list is a character vector of 6 column names.

[Image: column names in the list]

isNumValLong is the data frame that contains the values for the column names passed to the function plothist():

[Image: structure of the data frame]

I intend to pass colName to plothist, which should create the plots in sets of 6 variables per call.

plothist <- function(colName, myDf) {
  #str(name)
  #class(name)
  l <- length(colName)
  ls <- list()
  for (i in 1:l) {
    ls[[i]] <- isNumValLong %>% filter(key %in% colName[i]) %>%
      ggplot(aes(x = value)) + geom_histogram() + facet_grid(. ~ key, scales = "free")

  }
  grid.arrange(ls)
}

My function call is below:

lapply(colName, plothist, isNumValLong)

The function returns an error:

[Image: error message]

Please let me know where I am going wrong.

@MrFlick, here is reproducible code:

library(ggplot2)
library(tidyr)
library(gridExtra)

plothist <- function(name, myDf) {
  #str(name)
  #class(name)
  l <- length(name)
  ls <- list()
  for (i in 1:l) {
    ls[[i]] <- myDf %>% filter(key %in% name[i]) %>%
      ggplot(aes(x = value)) + geom_histogram() + ggtitle(label = as.character(name[i]))

  }
  grid.arrange(ls)
}


u <- rnorm(10^6)
v <- rnorm(10^6)
w <- rnorm(10^6)
x <- rnorm(10^6)
y <- rnorm(10^6)
z <- rnorm(10^6)

isNumVal <- data.frame(cbind(u, v, w, x, y, z))
isNumValLong <- gather(isNumVal, key = "key", value = "value")
colName <- as.character(unique(isNumValLong$key))
colName <- split(colName, ceiling(seq_along(colName)/2))

lapply(colName, plothist, isNumValLong)

The expected output is 3 sets of plots, each with two subplots. For example, the 1st set looks like the following:

[Image: example 1st set]

SeGa
  • Have you tried running `debug(plothist)` and then calling `plothist(colName[[1]], isNumValLong)`? That way you could look into `ls` before the call to `grid.arrange` happens. – LAP Jun 25 '18 at 13:19
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Pictures of data aren't helpful because we can't copy/paste that into R to test. – MrFlick Jun 25 '18 at 13:44
  • Hi MrFlick. Here is a reproducible code: – Ravi Shankar Hela Jun 25 '18 at 16:11

1 Answer

Maybe this is what you want? I adapted your function slightly, so you only have to pass a data.frame (not the unique names), and you can pass the number of rows and columns to use for the plots.

library(ggplot2)
library(tidyr)
library(dplyr)
library(gridExtra)

u <- rnorm(10^3); v <- rnorm(10^3)
w <- rnorm(10^3); x <- rnorm(10^3)
y <- rnorm(10^3); z <- rnorm(10^3)

isNumVal <- data.frame(cbind(u, v, w, x, y, z))
isNumValLong <- gather(isNumVal, key = "key", value = "value")

plothist <- function(myDf, ncols = 3, nrows = 2) {
  uniqueKey <- unique(myDf$key)
  ls <- vector("list", length(uniqueKey))
  for (i in seq_along(uniqueKey)) {
    # Filter first, so the mean and median below use only this key's values
    myDfPlot <- myDf %>% filter(key %in% uniqueKey[i])
    ls[[i]] <- ggplot(myDfPlot, aes(x = value)) +
      geom_histogram() +
      # sep belongs to paste(); the three parts are joined with ", "
      ggtitle(label = paste(
        paste("Key:", uniqueKey[i]),
        paste("Mean:", round(mean(myDfPlot$value, na.rm = TRUE), 4)),
        paste("Median:", round(median(myDfPlot$value, na.rm = TRUE), 4)),
        sep = ", "
      ))
  }
  # marrangeGrob lays the plots out on as many pages as needed
  marrangeGrob(ls, ncol = ncols, nrow = nrows)
}

plothist(isNumValLong)
plothist(isNumValLong,2,1)
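
As for the error in the question, a likely cause (an assumption, since the error message was only posted as an image) is the call grid.arrange(ls): gridExtra expects a list of plots through the named `grobs` argument, not as the first positional argument. A minimal sketch with made-up data, where `df`, `plots` and `combined` are illustrative names:

```r
library(ggplot2)
library(gridExtra)

# Two small example plots built from simulated data (illustrative only)
df <- data.frame(value = rnorm(100))
plots <- list(
  ggplot(df, aes(x = value)) + geom_histogram(bins = 20),
  ggplot(df, aes(x = value)) + geom_density()
)

# grid.arrange(plots) would fail, because a list passed positionally
# is treated as a single (invalid) grob. Name the list instead:
combined <- arrangeGrob(grobs = plots, ncol = 2)  # builds the layout without drawing
grid.arrange(grobs = plots, ncol = 2)             # builds the layout and draws it
```

marrangeGrob, used in the function above, takes the list as its first argument (`grobs`), which is why passing `ls` to it directly works.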
SeGa
  • Thank you! This works. My actual data has 35 numeric variables, so this code would put all the plots in one window and make interpretation difficult. I have picked up "marrange", which works fine with my original code, where I pass 2 at a time. – Ravi Shankar Hela Jun 25 '18 at 17:27
  • Edited the function. You can now pass the number of rows and columns for the plots. And the bug is gone, so it will also work in a clean environment. ;) – SeGa Jun 25 '18 at 18:33
  • Great! What would make it perfect is printing the mean and median of the values in ggtitle. With `ggtitle(label = paste(as.character(name[i]),as.character(mean(value, na.rm = TRUE)), sep = ", "))` I get an error: `Error in mean(value) : object 'value' not found`. How do I get this right? – Ravi Shankar Hela Jun 25 '18 at 18:49
  • Changed the function. I filter the data first and then assign only the plots to the list. I think it's losing the reference to the columns otherwise. – SeGa Jun 25 '18 at 19:15
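
On the `object 'value' not found` error from the comments: unlike aes(), ggtitle() evaluates its arguments as ordinary R code, so `value` is looked up as a variable rather than as a column of the data. One way around this is to compute the statistics from the already filtered values. A minimal sketch, where `make_title` is a hypothetical helper and not part of the answer above:

```r
# Hypothetical helper (an illustration, not from the original answer):
# builds a plot title from a key name and the numeric values that were
# already filtered for that key.
make_title <- function(key, values, digits = 4) {
  paste(
    paste("Key:", key),
    paste("Mean:", round(mean(values, na.rm = TRUE), digits)),
    paste("Median:", round(median(values, na.rm = TRUE), digits)),
    sep = ", "
  )
}

make_title("u", c(1, 2, 3, 10))
# "Key: u, Mean: 4, Median: 2.5"
```

Inside the loop this could be used as ggtitle(label = make_title(uniqueKey[i], myDfPlot$value)), assuming the filtered data frame is called myDfPlot as in the answer.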