I would like to use ggplot2::geom_function() to plot functions for a set of given parameters from a data frame. These should then be organized into facets by a "year" parameter. There are solutions to similar problems on StackOverflow (here and here) but I think my use case is slightly more complex.

My input is a data frame. It contains parameters of different normal distributions (mu, sigma, lambda), which I would like to plot, and a centralYear parameter that I would like to use as facets for facet_wrap. In addition to other use cases, I want to plot several normal distributions per facet, which are distinguished by another parameter generation. The number of generations per year might be variable. The data frame looks like this:

input data frame

What I would like to have is an output that looks like this:

output plot

Here's a minimal working example including a test data frame:

# libraries
library(ggplot2) # for plotting
library(dplyr)   # for filtering data and pipes 


# Input data
fantasy_df <- data.frame(
  centralYear = c(2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2015, 2015, 2015, 2015, 2014, 2014, 2014, 2014, 2013, 2013, 2013),
  generation  = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3),
  mu          = c(123.6, 188.7, 234.5, 269.6, 122.6, 188.4, 232.5, 269.6, 117.3, 187.1, 233.2, 271.3, 117.3, 187.3, 232.8, 271.6, 118.4, 193.9, 246.7),
  sigma       = c(14.6, 14.6, 14.6, 14.6, 14.8, 14.8, 14.8, 14.8, 15.1, 15.1, 15.1, 15.1, 15.4, 15.4, 15.4, 15.4, 17.5, 17.5, 17.5),
  lambda      = c(0.06, 0.44, 0.34, 0.15, 0.07, 0.46, 0.30, 0.17, 0.07, 0.46, 0.33, 0.15, 0.08, 0.45, 0.33, 0.14, 0.09, 0.53, 0.37) 
)

# Plot function & colours
ndist_function <- function(x, mu, sigma, lam) {
  lam * dnorm(x, mu, sigma)
}

colours <- c("blue", "red", "green", "violet")


# Simple single plot
singleYear_df <- fantasy_df[1:4, ]
  
fantasy_df %>%
  ggplot() +
  geom_function(fun = ndist_function,
                args = list(singleYear_df[1, 3], singleYear_df[1, 4], singleYear_df[1, 5]),
                colour = colours[1], lwd = 1.5) +
  geom_function(fun = ndist_function,
                args = list(singleYear_df[2, 3], singleYear_df[2, 4], singleYear_df[2, 5]),
                colour = colours[2], lwd = 1.5) +
  geom_function(fun = ndist_function,
                args = list(singleYear_df[3, 3], singleYear_df[3, 4], singleYear_df[3, 5]),
                colour = colours[3], lwd = 1.5) +
  geom_function(fun = ndist_function,
                args = list(singleYear_df[4, 3], singleYear_df[4, 4], singleYear_df[4, 5]),
                colour = colours[4], lwd = 1.5) +
  xlim(0, 300)+
  ylab("Density")



# Complex wrapped plot
gen1 <- filter(fantasy_df, generation == 1)
gen2 <- filter(fantasy_df, generation == 2)
gen3 <- filter(fantasy_df, generation == 3)
gen4 <- filter(fantasy_df, generation == 4)
  
ggplot(data=fantasy_df) +
  geom_function(fun = ndist_function,
                args = list(gen1$mu, gen1$sigma, gen1$lambda),
                colour = colours[1], lwd = 1.5)+
  geom_function(fun = ndist_function,
                args = list(gen2$mu, gen2$sigma, gen2$lambda),
                colour = colours[2], lwd = 1.5)+
  geom_function(fun = ndist_function,
                args = list(gen3$mu, gen3$sigma, gen3$lambda),
                colour = colours[3], lwd = 1.5)+
  geom_function(fun = ndist_function,
                args = list(gen4$mu, gen4$sigma, gen4$lambda),
                colour = colours[4], lwd = 1.5)+
  facet_wrap(~centralYear, nrow=5)+
  xlim(0, 300)

The single plot for a given year looks fine:

single plot, geom_function

But the combined plot using facet_wrap does not:

combined plot, geom_function and facet_wrap

This is obviously not what I'd like to have. It seems the same function is plotted in every facet. Maybe there's also a different solution than using facet_wrap.

Any help would be very much appreciated!

Quinten

2 Answers

Adapting this answer to your case, this can be achieved via facet_wrap by "overwriting" the centralYear value for each geom_function layer with the value of the panel in which it should be displayed. Additionally, instead of adding the layers one by one, I use purrr::pmap to loop over your data frame of parameters and create the function layers. The overwriting is done by passing mutate(fantasy_df, centralYear = .env$centralYear) to the data argument.

library(ggplot2)
library(dplyr, warn.conflicts = FALSE)
library(purrr)

layer_function <- fantasy_df %>%
  mutate(color = colours[generation]) %>%
  pmap(function(mu, sigma, color, lambda, centralYear, ...) {
    geom_function(data = mutate(fantasy_df, centralYear = .env$centralYear),
      fun = ndist_function,
      args = list(mu, sigma, lambda),
      colour = color
    )
  })

ggplot() +
  layer_function +
  facet_wrap(~centralYear, nrow = 5) +
  xlim(0, 300)
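
To see the mechanism in isolation, here is a minimal sketch with only two layers. It assumes that a geom_function layer is drawn only in the panel whose facet value appears in that layer's data, which is what the approach above relies on. The names params and layers are illustrative; the parameter values are the generation 2 rows for 2017 and 2016 from the question.

```r
library(ggplot2)

# Two illustrative parameter rows (generation 2 for 2017 and 2016)
params <- data.frame(
  centralYear = c(2017, 2016),
  mu          = c(188.7, 188.4),
  sigma       = c(14.6, 14.8),
  lambda      = c(0.44, 0.46)
)

ndist_function <- function(x, mu, sigma, lam) {
  lam * dnorm(x, mu, sigma)
}

# One layer per row; each layer's data carries only the facet value
# of the panel it belongs to
layers <- lapply(seq_len(nrow(params)), function(i) {
  geom_function(
    data = data.frame(centralYear = params$centralYear[i]),
    fun  = ndist_function,
    args = list(params$mu[i], params$sigma[i], params$lambda[i])
  )
})

p <- ggplot() +
  layers +
  facet_wrap(~centralYear, ncol = 1) +
  xlim(0, 300)
```

Printing p should show one curve per panel. The purrr::pmap version does the same thing for every row of fantasy_df.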

stefan
  • Very cool approach using `geom_function()` as I had in mind originally. I'm amazed about the fantastic answers I got in the shortest amount of time, using quite different approaches. Thank you! – Moehrengulasch Oct 10 '22 at 10:14

I think I would just reshape the data so that each curve is evaluated on a grid of x values, and plot it with geom_area:

library(tidyverse)

fantasy_df %>%
  mutate(centralYear = factor(centralYear, 2017:2013), 
         generation = factor(generation)) %>%
  group_by(centralYear, generation) %>%
  summarize(x = 100:320, Density = lambda * dnorm(x, mu, sigma)) %>%
  ggplot(aes(x, Density, fill = after_scale(color), color = generation)) +
  geom_area(position = 'identity', alpha = 0.3) +
  geom_text(aes(x = 310, y = 0.005, label = centralYear), color = 'black',
            check_overlap = TRUE, fontface = 2) +
  facet_grid(centralYear~.) +
  scale_y_continuous(breaks = 0:1/100) +
  scale_color_manual(values = c('blue', 'red', 'green', 'magenta')) +
  theme_minimal() +
  theme(strip.text = element_blank(), legend.position = 'none')
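
A note on newer dplyr versions: from dplyr 1.1.0 onward, summarize() warns when it returns more than one row per group, and reframe() is the intended replacement. The sketch below shows only the reshaping step with reframe(), on a three-row subset of the question's data; the names fantasy_small and curves are illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

# Three rows taken from the question's fantasy_df
fantasy_small <- data.frame(
  centralYear = c(2017, 2017, 2016),
  generation  = c(1, 2, 1),
  mu          = c(123.6, 188.7, 122.6),
  sigma       = c(14.6, 14.6, 14.8),
  lambda      = c(0.06, 0.44, 0.07)
)

# Evaluate each weighted normal density on a grid of 221 x values;
# reframe() allows any number of rows per group and returns an ungrouped result
curves <- fantasy_small %>%
  group_by(centralYear, generation) %>%
  reframe(x = 100:320, Density = lambda * dnorm(x, mu, sigma))
```

The result has one row per (centralYear, generation, x) combination and can be piped into the same ggplot() call as above.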

Allan Cameron
  • Thank you very much, perfect! I've decided to mark this as the accepted answer because it seems to be more in line with what ggplot expects from the data. One small problem I encountered while adapting this to larger datasets: when setting a `ylim`, geom_area might be cropped and no fill is applied. Could there be any workaround for this? [Here](https://i.stack.imgur.com/zzYdP.png) is an example image. – Moehrengulasch Oct 10 '22 at 10:13
  • @Moehrengulasch If you want to set y limits, do so inside `coord_cartesian`. For example, if you want to set the upper y limit to 0.1, do `+ coord_cartesian(ylim = c(0, 0.1))`. The difference is that this will not remove any data from the underlying calculations, but will simply "zoom in" to the plot. – Allan Cameron Oct 10 '22 at 10:24
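
The difference described in the last comment can be sketched as follows, using a single curve from the question's data (2017, generation 2). The upper limit of 0.005 is an arbitrary illustrative value chosen to lie below the curve's peak; curve, base_plot, p_clip and p_zoom are illustrative names.

```r
library(ggplot2)

# One weighted normal density evaluated on a grid
curve <- data.frame(x = 100:320)
curve$Density <- 0.44 * dnorm(curve$x, mean = 188.7, sd = 14.6)

base_plot <- ggplot(curve, aes(x, Density)) +
  geom_area(fill = "red", alpha = 0.3)

# ylim() sets the scale limits: values above 0.005 are treated as missing
# before the area is drawn, so the fill can break or disappear
p_clip <- base_plot + ylim(0, 0.005)

# coord_cartesian() only zooms the view: all values are kept and the
# area is simply cut off at the top of the panel
p_zoom <- base_plot + coord_cartesian(ylim = c(0, 0.005))
```

Printing p_clip typically produces a warning about removed rows, while p_zoom draws the full area cut off at the upper limit.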