4

I am trying to identify why I have a purple line appearing along the x axis that is the same color as "Prypchan, Lida" from my legend. I took a look at the data and do not see any issues there.

ggplot(LosDoc_Ex, aes(x = LOS)) +
  geom_density(aes(colour = AttMD)) +
  theme(legend.position = "bottom") +
  xlab("Length of Stay") +
  ylab("Distribution") +
  labs(title = "LOS Analysis * ",
       caption = "*excluding Residential and WSH",
       color = "Attending MD: ")

LOS Analysis by Doc

Claus Wilke
Luke Baker
    Welcome to stack overflow! As it stands, this question is off topic since it is asking for debugging help without a reproducible example. Please put together a [minimal, complete, and verifiable example](https://stackoverflow.com/help/mcve) and edit your question to include it. Some R specific tips for that are here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – De Novo Apr 01 '18 at 00:15

2 Answers

5

Usually I'd wait for a reproducible example, but in this case, I'd say the underlying explanation is really quite straightforward:

geom_density() creates a polygon, not a line.

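Using a sample dataset from ggplot2's own package, we can observe the same straight line below the density plots, running along the x-axis and up the y-axis. Because the polygon's outline is closed along the baseline, its border is drawn there as well, and the colour of that line simply depends on which group is drawn on top of the rest: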

library(ggplot2)
p <- ggplot(diamonds, aes(carat, colour = cut)) +
  geom_density()
p  # print the plot

plot
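
To make the polygon explanation concrete, here is a minimal base R sketch. It is an illustration only, not ggplot2's internal code, and the sample values are made up. It builds an open path (the density curve alone) and a closed path (the curve plus the two baseline corners), then strokes the border of the closed one:

```r
# Made-up sample values, standing in for one group's LOS or carat column
set.seed(1)
values <- rexp(200, rate = 1.5)

d <- density(values)

# Open path: only the density curve, which is what a line geom draws
open_path <- data.frame(x = d$x, y = d$y)

# Closed path: the same curve with both ends dropped to y = 0, which is
# roughly the outline a polygon has to trace to enclose the area
closed_path <- rbind(
  data.frame(x = min(d$x), y = 0),
  open_path,
  data.frame(x = max(d$x), y = 0)
)

plot(open_path, type = "l", xlab = "x", ylab = "density")
polygon(closed_path, border = "purple")  # the border also runs along y = 0
```

The purple border follows the curve and then returns along the baseline, which is the same kind of line seen along the x-axis in the question.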

Workaround 1: You can manually calculate the density values yourself for each colour group in a new data frame, & plot the results using geom_line() instead of geom_density():

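library(ggplot2)  # for the diamonds dataset
library(dplyr)
library(tidyr)
library(purrr)
# One density estimate per cut, unpacked into long x/y columns
diamonds2 <- diamonds %>%
  nest(data = -cut) %>%
  mutate(density = map(data, ~density(.x$carat))) %>%
  mutate(density.x = map(density, ~.x[["x"]]),
         density.y = map(density, ~.x[["y"]])) %>%
  select(cut, density.x, density.y) %>%
  unnest(c(density.x, density.y))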

ggplot(diamonds2, aes(x = density.x, y = density.y, colour = cut)) +
  geom_line()

plot with new data frame

Workaround 2: Or you can take the data generated by the original plot, & plot that using geom_line(). The colours would need to be remapped to the legend values though:

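lp <- layer_data(p)  # data computed by the density stat for the original plot
# Recover the legend labels in the order the groups were assigned
if (is.factor(diamonds$cut)) {
  col.lev <- levels(diamonds$cut)
} else {
  col.lev <- sort(unique(diamonds$cut))
}
lp$cut <- factor(lp$group, labels = col.lev)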

ggplot(lp, aes(x = x, y = ymax, colour = cut)) +
  geom_line()

plot with data frame from original plot

Z.Lin
  • Thank you very much. This is my first stack post and will provide a reproducible example in the future. Your workarounds worked great! – Luke Baker Apr 03 '18 at 15:32
  • Another workaround is `geom_density_line()` from the ggridges package. I wrote it because I was tired of needing workarounds for this issue. https://stackoverflow.com/a/53773892/4975218 – Claus Wilke Dec 14 '18 at 05:26
  • Or simply use `geom_line()` with `stat = "density"` if you don't want the filled area. – Claus Wilke Dec 14 '18 at 16:20
4

There are two simple workarounds. First, if you only want lines and no filled areas, you can simply use geom_line() with the density stat:

library(ggplot2)
ggplot(diamonds, aes(x = carat, y = stat(density), colour = cut)) +
  geom_line(stat = "density")

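Note that for this to work, we need to set the y aesthetic to stat(density). In ggplot2 3.3.0 and later the same mapping is written after_stat(density), and stat() has since been deprecated.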

Second, if you want the area under the lines to be filled, you can use geom_density_line() from the ggridges package. It works exactly like geom_density() but draws a line (with filled area underneath) rather than a polygon.

library(ggridges)
ggplot(diamonds, aes(x = carat, colour = cut, fill = cut)) +
  geom_density_line(alpha = 0.2)

Created on 2018-12-14 by the reprex package (v0.2.1)

Claus Wilke
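
A further option, not mentioned in the answers above: newer ggplot2 releases give geom_density() an outline.type argument. The sketch below assumes ggplot2 3.3.0 or later is installed (an assumption about your setup, so check ?geom_density for the values your version accepts). It compares an outline that strokes only the density curve with one that strokes the closed polygon:

```r
library(ggplot2)

# Assumes ggplot2 >= 3.3.0, where geom_density() has an outline.type argument.
# "upper" strokes only the density curve; "full" strokes the closed polygon.
p_upper <- ggplot(diamonds, aes(x = carat, colour = cut, fill = cut)) +
  geom_density(alpha = 0.2, outline.type = "upper")

p_full <- ggplot(diamonds, aes(x = carat, colour = cut, fill = cut)) +
  geom_density(alpha = 0.2, outline.type = "full")

p_upper  # no coloured line along the x-axis
p_full   # closed outline, the behaviour described in the question
```

If your installed version lacks the argument, the geom_line() and geom_density_line() workarounds above still apply.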