I am trying to create a line plot with data points using ggplot, colouring the points according to quartile-based cut-off values. When the cut-offs come from summary statistics, geom_point plots no points and issues the warning "removed the rows containing missing values." With the same code but the numerical values written in explicitly (not computed from summary statistics), the plot comes out fine. I would appreciate suggestions for fixing this. The following is the reproducible code:

library(ggplot2)

patient <- cbind.data.frame(1:14, matrix(sample(1:100, 84), ncol = 6))
colnames(patient) <- c('DAYS', 'PHYSICAL_ACTIVITY', 'SMOKING', 'ALCOHOL_INTAKE', 'HYDRATION', 'SLEEP', 'Total_score')

ggplot(data=patient, aes(x=DAYS,y=SLEEP)) +
  geom_line(colour='black', size=1) +
  geom_point(size=3,aes(colour=cut(SLEEP, c(-Inf,summary(SLEEP)[[2]],summary(SLEEP)[[5]],Inf))), show.legend=F) +
  scale_color_manual(values = c("(-Inf,summary(SLEEP)[[2]]]" = "green", "(summary(SLEEP)[[2]],summary(SLEEP)[[5]]]" = "orange", "(summary(SLEEP)[[5]], Inf]" = "red")) +
  theme(axis.title.y=element_blank()) +
  theme(axis.title.x=element_blank(), axis.text.x=element_blank(),axis.ticks.x=element_blank()) +
  ggtitle("SLEEP (hrs)")+ theme(panel.background = element_blank()) +
  guides(fill=FALSE)+ theme(plot.title = element_text(size = 8, face = "bold"))

Thanks

tifu
Eddie S
  • Possible duplicate of [Geom\_jitter colour based on values](https://stackoverflow.com/questions/51159196/geom-jitter-colour-based-on-values) – Scransom Jul 05 '18 at 05:46

2 Answers

1

Look at the output of cut(...):

> cut(patient$SLEEP, c(-Inf, summary(patient$SLEEP)[[2]], summary(patient$SLEEP)[[5]], Inf))
[1] (62.5, Inf] (22.8,62.5] (62.5, Inf] (62.5, Inf] (-Inf,22.8] (22.8,62.5]
[7] (-Inf,22.8] (22.8,62.5] (-Inf,22.8] (22.8,62.5] (62.5, Inf] (-Inf,22.8]
[13] (22.8,62.5] (22.8,62.5]
Levels: (-Inf,22.8] (22.8,62.5] (62.5, Inf]

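So ggplot expects those level names in scale_color_manual, not the literal strings from your code, which match no level and leave every point with a missing colour: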

values = c('(-Inf,22.8]' = 'green', '(22.8,62.5]' = 'orange', '(62.5, Inf]' = 'red')

But you don't need to name the levels; just give the colors in the corresponding order:

values = c('green', 'orange', 'red')
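
If you do want the names spelled out, they can be built from the factor's own levels rather than typed by hand. The following is only a sketch: the seed and the stand-in vector `sleep` (in place of `patient$SLEEP`) are assumptions for illustration, not part of the question.

```r
# Stand-in data; the seed and the name `sleep` are illustrative assumptions.
set.seed(1)
sleep <- sample(1:100, 14)

# Same breaks as in the question: 1st and 3rd quartile taken from summary().
breaks <- c(-Inf, summary(sleep)[[2]], summary(sleep)[[5]], Inf)
grp <- cut(sleep, breaks)

# The names written in the question are literal strings, so none of them
# can equal a level, which is built from the numeric break values.
question_names <- c("(-Inf,summary(SLEEP)[[2]]]",
                    "(summary(SLEEP)[[2]],summary(SLEEP)[[5]]]",
                    "(summary(SLEEP)[[5]], Inf]")
any(question_names %in% levels(grp))

# Named colour vector derived from the actual levels, in level order.
cols <- setNames(c("green", "orange", "red"), levels(grp))
cols
```

A vector like `cols` could then be passed as `values = cols` to scale_color_manual, and it stays correct when the data (and therefore the break values) change.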

You also don't need all those repeated theme lines:

ggplot(patient, aes(DAYS, SLEEP)) +
  geom_line() +
  geom_point(
    aes(colour = cut(SLEEP, c(-Inf, summary(SLEEP)[[2]], summary(SLEEP)[[5]], Inf))),
    size = 3, show.legend = FALSE ) +
  scale_color_manual(values = c('green', 'orange', 'red')) +
  labs(title = 'SLEEP (hrs)', x = NULL, y = NULL) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 8, face = 'bold'),
    panel.grid = element_blank() )


0

Is this what you're after?

library(tidyverse)
patient %>%
    mutate(Quartile = cut(
        SLEEP,
        c(-Inf, quantile(SLEEP)[2], quantile(SLEEP)[3], quantile(SLEEP)[4], Inf),
        labels = c("1st", "2nd", "3rd", "4th"))) %>%
    ggplot(aes(DAYS, SLEEP, colour = Quartile)) +
    geom_point(show.legend = F, size = 3) +
    geom_line(aes(colour = Quartile)) +
    scale_colour_manual(values = c(
        "1st" = "blue",
        "2nd" = "green",
        "3rd" = "orange",
        "4th" = "red")) +
    theme(
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(size = 8, face = "bold")) +
    guides(fill = FALSE) +
    ggtitle("SLEEP (hrs)")
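
The three interior break points used in cut() above can also be computed with a single quantile() call. This is a rough sketch in base R only; the seed and the stand-in vector `sleep` (in place of `patient$SLEEP`) are assumptions for illustration.

```r
# Stand-in data; the seed and the name `sleep` are illustrative assumptions.
set.seed(1)
sleep <- sample(1:100, 14)

# Interior break points: 25th, 50th and 75th percentiles in one call.
q <- quantile(sleep, probs = c(0.25, 0.5, 0.75))

# -Inf and Inf as the outer breaks, so the minimum and maximum always
# fall inside an interval and no value becomes NA.
quartile <- cut(sleep, c(-Inf, q, Inf), labels = c("1st", "2nd", "3rd", "4th"))

# How many observations landed in each group.
table(quartile)
```

Because the labels are fixed strings ("1st" to "4th"), a named colour vector keyed on them keeps matching regardless of the data.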


Maurits Evers