I found a way to "hack" ggplot by combining two geom_area plots to create a normal distribution with a tail area:

library(ggplot2)
mean <-  0
standard_deviation <- 1
Zscore <- -1.35

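# Convert the z-score back to the scale of the distribution
observation <- (Zscore * standard_deviation) + mean
# Lower-tail probability P(X < observation); the outer parentheses print the result
(tail_area <- round(pnorm(observation, mean, standard_deviation), 2))

ggplot(NULL, aes(c(-5, 5))) +
    geom_area(stat = "function", fun = dnorm, fill = "sky blue", xlim = c(-5, observation)) +
    geom_area(stat = "function", fun = dnorm, xlim = c(observation, 5))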

Is there a less "hackey" approach using ggplot to create normal distributions and highlight tail areas like the one above?

mnm
Angel Cloudwalker
  • For information, there's an alternative but still 'hackey' approach given here https://stackoverflow.com/questions/12429333/how-to-shade-a-region-under-a-curve-using-ggplot2. – Chris Feb 29 '20 at 09:54

1 Answer

First off, I like your approach. I'm not sure whether this is any less "hackey", but here's another option using gghighlight:

# Generate data (see comment below)
library(dplyr)
df <- data.frame(x = seq(-5, 5, length.out = 100)) %>% mutate(y = dnorm(x))

# (gg)plot and (gg)highlight
library(ggplot2)
library(gghighlight)
ggplot(df, aes(x, y)) + geom_area(fill = "sky blue") + gghighlight(x < -1.35)

From what I understand, gghighlight needs a data argument, so it won't work with geom_area by itself (meaning: without data but with stat = "function"), or with stat_function. That's why I'm generating data df first.
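
Since the curve is already stored in a data frame, a plain ggplot2 variant is also possible: draw the full area once, then add a second geom_area layer restricted to the rows in the tail. This is only a sketch under assumptions, not part of the gghighlight approach above; the name `cutoff` and the grey base fill are hypothetical choices made for illustration.

```r
library(ggplot2)

# Same grid as above, built with base R only
df <- data.frame(x = seq(-5, 5, length.out = 100))
df$y <- dnorm(df$x)

cutoff <- -1.35  # hypothetical name for the z-score used in the question

p <- ggplot(df, aes(x, y)) +
    geom_area(fill = "grey35") +                 # full density curve
    geom_area(data = subset(df, x <= cutoff),    # rows in the left tail only
              fill = "skyblue")
p
```

Note that the shaded edge falls on the last grid point at or below `cutoff`, not exactly at -1.35; a finer grid (a larger `length.out`) narrows that gap.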


Update

In response to your comment about how to "highlight the area between 1 and -1", you can do the following:

ggplot(df, aes(x, y)) + geom_area(fill = "sky blue") + gghighlight(abs(x) < 1)

Update 2

To highlight the region 1.5 < x < 2.5, simply use the condition x > 1.5 & x < 2.5:

ggplot(df, aes(x, y)) + geom_area(fill = "sky blue") + gghighlight(x > 1.5 & x < 2.5)
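
For reference, the probability covered by such an interval can be computed in the same way as the tail area in the question, as a difference of two pnorm values. This is a small sketch assuming a standard normal distribution; `lower`, `upper` and `interval_area` are names introduced here for illustration.

```r
lower <- 1.5
upper <- 2.5

# P(lower < X < upper) for a standard normal X
interval_area <- pnorm(upper) - pnorm(lower)
round(interval_area, 2)
```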


To pre-empt potential follow-up questions: this method only works for contiguous regions. That is, I haven't found a way to highlight both tails (x < -2.5 | x > 2.5) in a single gghighlight statement.
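
For the two-tail case, one workaround that stays within plain ggplot2 is to draw one extra geom_area layer per tail, each on its own subset of the data. This is a sketch under assumptions rather than a gghighlight feature; `limit`, `p_tails` and `two_tail_area` are hypothetical names, and the finer grid is just a choice to make the narrow tails render more smoothly.

```r
library(ggplot2)

df <- data.frame(x = seq(-5, 5, length.out = 1000))
df$y <- dnorm(df$x)

limit <- 2.5  # hypothetical cut-off: tails are x <= -limit and x >= limit

p_tails <- ggplot(df, aes(x, y)) +
    geom_area(fill = "grey35") +                                    # full density curve
    geom_area(data = subset(df, x <= -limit), fill = "skyblue") +   # left tail
    geom_area(data = subset(df, x >= limit), fill = "skyblue")      # right tail
p_tails

# Combined probability of both tails (the standard normal is symmetric)
two_tail_area <- 2 * pnorm(-limit)
```

Each tail gets its own layer because a single area layer over both subsets would be joined across the gap between them.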

Maurits Evers
  • This is great! How would I tell gghighlight to only highlight the area between 1 and -1 though? – Angel Cloudwalker Feb 29 '20 at 19:03
  • I find I have to make multiple geom_area and gghighlight to get two points. – Angel Cloudwalker Feb 29 '20 at 22:30
  • @MilesMorales I've made an edit to my post, please take a look. To highlight between 1 and -1 simply use the condition `abs(x) < 1` . – Maurits Evers Mar 01 '20 at 04:41
  • So you are using one conditional statement for one area, but I'm interested in how you can shade areas in two places that you can't cover with one conditional statement, like <2.5 and >1.5. This is the issue I'm running into. – Angel Cloudwalker Mar 01 '20 at 06:40
  • @MilesMorales I’m confused. You asked how to *"highlight the area between 1 and -1"* in your earlier comment. That’s what I’m showing. – Maurits Evers Mar 01 '20 at 06:56
  • @MilesMorales You seem to be changing your problem statements again and again. First it was a single tail area. Then it was the area between 1 and -1. Now it's the area <2.5 and >1.5. For future posts, please post your question with a clear problem statement right from the start. I've made another edit showing you how to highlight the area <2.5 and >1.5. To pre-empt another question: From what I can tell, the `gghighlight` approach will only work for contiguous areas. Meaning you can't e.g. highlight the positive and negative tail ends in a single call. I hope this helps! – Maurits Evers Mar 01 '20 at 08:33
  • It helps and yes that's what I wanted to know if I could highlight positive and negative tail ends but I guess not. Also I'm not changing my original question - it still stands as there is no chart in ONLY ggplot where I can handle highlighting and creating normal distributions without some tinkering, I was just hoping you'd be helpful enough to expand a bit on the gghighlight functionality since I've never used it before, and you've been very helpful - thanks. – Angel Cloudwalker Mar 01 '20 at 20:00
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/208808/discussion-between-milesmorales-and-maurits-evers). – Angel Cloudwalker Mar 01 '20 at 20:07
  • @MilesMorales **You have changed your question 4 times**: First you ask about highlighting a single tail only; then about highlighting between -1 and 1; then about highlighting <2.5 and >1.5; and now you ask about highlighting both tails. **Those are your words/comments, not mine!** If you had asked about highlighting two tails right from the start, I would not have bothered to answer because I don't think there's a lean solution to achieve that with `gghighlight`. I'm happy to help, but you need to understand that this is an awful way of asking for help. It wastes everyone's time. – Maurits Evers Mar 01 '20 at 21:06
  • The original question has never been changed, and your solution wasn't a solution to the question, just like mine wasn't; yours was just an alternative, like mine. I was simply having an exchange with you about your alternative approach, but you were not at all obligated to continue that exchange if all you wanted was an acceptance as an answer for the original question and felt it was a waste of time. I opened up a discussion in chat as well, but you wanted to continue the back and forth here. Anyhow, I still appreciate you for sharing, thanks! – Angel Cloudwalker Mar 02 '20 at 20:03