
I am using ggpairs from the GGally package (which builds on ggplot2).

I need a histogram on the diagonal of the ggpairs plot, with a normal density curve superimposed, using the mean and sd of the data.

I read the help (https://www.rdocumentation.org/packages/GGally/versions/1.4.0/topics/ggpairs) but can't find an option to do it. I guess I must build my own function (myfunct) and then call

ggpairs(sample.dat, diag=list(continuous = myfunct))

Has anyone tried this?


I have tried the following:

head(data) 
      x1    x2    x3    x4    x5    x6     F1    F2 
1 -0.749 -1.57 0.408 0.961 0.777 0.171 -0.143 0.345 

myhist = function(data){ 
          ggplot(data, aes(x)) + 
             geom_histogram(aes(y = ..density..),colour = "black") + 
             stat_function(fun = dnorm, args = list(mean = mean(x), sd = sd(x))) 
           } 

ggpairs(sample.data, diag=list(continuous = myhist))

The result is:

Error in (function (data) : unused argument (mapping = list(~x1))

user20650
JPMD
  • as you mention, just write your own function. relevant plotting code https://stackoverflow.com/questions/6967664/ggplot2-histogram-with-normal-curve – user20650 Oct 09 '19 at 21:58
  • ... yes, but since I have a data frame, I am not sure how to do it. I have tried: – JPMD Oct 10 '19 at 09:08
  • I just don't know how to pass the data into the mean=mean() argument of the stat_function – JPMD Oct 10 '19 at 12:40
  • JPMD, I've added a quick example. – user20650 Oct 10 '19 at 13:59
  • ... and https://ggobi.github.io/ggally/#custom_functions provides some better docs and examples – user20650 Oct 10 '19 at 14:37

1 Answer


This question provides an example of the code to add a normal curve to a histogram in ggplot2. You can use this to write your own function to pass to the diag argument of ggpairs. ggpairs calls that function with two arguments, data and mapping, so to calculate the mean and sd of the data you can grab the relevant column using, for example, eval_data_col(data, mapping$x). Example below (perhaps a little more complicated than needed, but it allows you to pass parameters to change colours etc. using the wrap functionality).

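library(GGally)  # attaches ggplot2 as well

# Custom diagonal panel: ggpairs calls it with `data` and `mapping`.
# `hist` collects arguments for geom_histogram(); `...` goes to stat_function().
diag_fun <- function(data, mapping, hist = list(), ...) {

    # Pull out the column mapped to x so its mean and sd can be computed
    X  <- eval_data_col(data, mapping$x)
    mn <- mean(X)
    s  <- sd(X)

    ggplot(data, mapping) +
      # Density scale, so the histogram and the normal curve are comparable
      # (ggplot2 >= 3.4.0 deprecates ..density.. in favour of after_stat(density))
      do.call(function(...) geom_histogram(aes(y = ..density..), ...), hist) +
      stat_function(fun = dnorm, args = list(mean = mn, sd = s), ...)
}

# wrap() fixes the extra arguments before ggpairs calls diag_fun
ggpairs(iris[1:100, 1:4],
        diag = list(continuous = wrap(diag_fun, hist = list(fill = "red", colour = "blue"),
                                      colour = "green", lwd = 2)))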
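
If the wrap machinery is not needed, a pared-down version of the same idea might look like the sketch below. The name simple_diag, the bins = 30 choice, the fixed colours and the na.rm = TRUE handling are assumptions made for illustration, not part of the example above; after_stat(density) assumes ggplot2 3.3.0 or later (on older versions, ..density.. plays the same role).

```r
library(GGally)  # attaches ggplot2 as well

# Hypothetical minimal diagonal panel (illustrative name).
# ggpairs supplies `data` and `mapping`; `...` is accepted but unused.
simple_diag <- function(data, mapping, ...) {
  # Column mapped to the x aesthetic for this panel
  x <- eval_data_col(data, mapping$x)

  ggplot(data, mapping) +
    # Histogram on the density scale (assumed: 30 bins, fixed colours)
    geom_histogram(aes(y = after_stat(density)),
                   bins = 30, colour = "black", fill = "grey80") +
    # Normal curve using the sample mean and sd of that column
    stat_function(fun = dnorm,
                  args = list(mean = mean(x, na.rm = TRUE),
                              sd = sd(x, na.rm = TRUE)))
}

# The function can be passed directly, without wrap()
p <- ggpairs(iris[1:100, 1:4], diag = list(continuous = simple_diag))
```

Printing p draws the matrix. The trade-off is that colours and bin count are hard-coded here, whereas the wrap-based version lets them be set per call.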
user20650
  • Thank you user20650. The `X=eval_data_col` part solved my problem!... – JPMD Oct 11 '19 at 07:58
  • You're welcome. To get an idea of how to do things it is worth looking at the underlying R code, for example, `GGally::ggally_density` – user20650 Oct 11 '19 at 08:17