I'm struggling to modify the colour/shape/... of the points based on whether they are missing values or not.

library(ggplot2)
library(naniar)
ggplot(data = airquality,
       aes(x = Ozone,
           y = Solar.R)) +
  geom_miss_point()

What I have

airquality_no_na <- airquality[!(is.na(airquality$Ozone) | is.na(airquality$Solar.R)), ]
airquality_na <- airquality[(is.na(airquality$Ozone) | is.na(airquality$Solar.R)), ]
ggplot() +
  geom_point(data = airquality_no_na,
                  aes(x = Ozone,
                      y = Solar.R, colour = "NoMissing")) +
  geom_miss_point(data = airquality_na,
                  aes(x = Ozone,
                      y = Solar.R, colour = "Missing")) +
  scale_colour_manual(name = 'Legende', 
                    values =c('NoMissing'='green',
                              'Missing'='blue'))


What I would like to have

I don't know how to show the missing values in green and the non-missing values in blue without splitting the data into two data frames.

EDIT:

My issue is a bit more complex. I want to be able to choose the colours for the first data set (missing in blue, not missing in green) and for the second data set (missing in red, not missing in yellow).

#Create dataframes
df1=as.data.frame(matrix(data=runif(n=200, 0,1),ncol=2))
df2=as.data.frame(matrix(data=runif(n=100, 0,1),ncol=2))
#Add missing values
df1[rbinom(n=100,size=1,prob = 0.1) ==1,1] <- NA
df1[rbinom(n=100,size=1,prob = 0.1) ==1,2] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,1] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,2] <- NA

# This doesn't work: it only prints in blue (missing) and green (not missing)
ggplot() +
  geom_miss_point(data = df1,
                  aes(x = V1,
                      y = V2)) +
  geom_miss_point(data = df2,
                  aes(x = V1,
                      y = V2)) +
  scale_colour_manual(values = c("blue", "green", "yellow","red"))

tjebo
TrucMachin

2 Answers

I am not sure if this is a good idea, but for the sake of showing how to do this in theory: from a quick look into the naniar package, the colour aesthetic appears to be mapped to ..missing.. by default. You would need to dig quite deep into the actual geom to change that behaviour, but there is a simple workaround.

Create a second color scale with ggnewscale.

You will not get around subsetting your data first, but this is not a bad thing. Don't be afraid to subset your data; that's a very normal thing to do.

library(tidyverse)
library(naniar)
library(ggnewscale)

ggplot() +
  geom_miss_point(data = df1, aes(V1, V2)) +
  scale_colour_manual(name = "df1", values = c("blue", "green")) +
  new_scale_color() +
  geom_miss_point(data = df2, aes(V1, V2)) +
  scale_colour_manual(name = "df2", values =  c("yellow","red"))
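For the original single-data-set case (before the EDIT), no second scale should be needed. Here is a minimal sketch, assuming the computed ..missing.. variable uses the labels "Missing" and "Not Missing" that appear in the default legend; if your naniar version labels them differently, adjust the names in `values`.

```r
library(ggplot2)
library(naniar)

# Assumption: geom_miss_point() labels its computed colour variable
# "Missing" / "Not Missing"; the names below must match those labels.
miss_cols <- c("Missing" = "green", "Not Missing" = "blue")

p <- ggplot(airquality, aes(x = Ozone, y = Solar.R)) +
  geom_miss_point() +
  scale_colour_manual(name = "Legend", values = miss_cols)

p
```

Naming the values avoids depending on the order of the levels.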

tjebo
  • Thank you. Really easy to use. I wouldn't have found it on my own. I haven't found a way to modify the shape/size with the same workaround, but I'll do without it. – TrucMachin Jan 11 '21 at 21:01
With some trial and error I came up with a solution using the group aesthetic:

  1. Row bind your datasets and add an identifier
  2. Map the dataset identifier on group
  3. Map the interaction of ..group.. and naniar's ..missing.. on colour. (I first tried using dataset directly, but that did not work.)
library(ggplot2)
library(naniar)

set.seed(42)

#Create dataframes
df1=as.data.frame(matrix(data=runif(n=200, 0,1),ncol=2))
df2=as.data.frame(matrix(data=runif(n=100, 0,1),ncol=2))
#Add missing values
df1[rbinom(n=100,size=1,prob = 0.1) ==1,1] <- NA
df1[rbinom(n=100,size=1,prob = 0.1) ==1,2] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,1] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,2] <- NA

dplyr::bind_rows(df1, df2, .id = "dataset") %>% 
  ggplot() +
  geom_miss_point(aes(x = V1,
                      y = V2, 
                      group = dataset,
                      colour = interaction(..group.., ..missing..))) +
  scale_colour_manual(values = c("blue", "red", "green", "yellow"))
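If shape and size need to change as well, the same row-binding idea can be done without naniar's computed variables by building the missingness flag by hand and using a plain geom_point(). This is only a sketch under stated assumptions: `shift_below()` is a hypothetical helper (not a naniar function) that places NAs at a fixed distance below the observed minimum, and it does not add the jitter that geom_miss_point() applies, so missing points may overlap.

```r
library(ggplot2)

set.seed(42)

# Same toy data as above
df1 <- as.data.frame(matrix(runif(200), ncol = 2))
df2 <- as.data.frame(matrix(runif(100), ncol = 2))
df1[rbinom(100, 1, 0.1) == 1, 1] <- NA
df1[rbinom(100, 1, 0.1) == 1, 2] <- NA
df2[rbinom(50, 1, 0.1) == 1, 1] <- NA
df2[rbinom(50, 1, 0.1) == 1, 2] <- NA

# Hypothetical helper: replace NAs with a value below the observed minimum
shift_below <- function(x, prop = 0.1) {
  rng <- range(x, na.rm = TRUE)
  x[is.na(x)] <- rng[1] - prop * diff(rng)
  x
}

# Row bind with an identifier, then flag rows that have any missing value
both <- rbind(cbind(dataset = "df1", df1), cbind(dataset = "df2", df2))
both$status <- ifelse(is.na(both$V1) | is.na(both$V2), "Missing", "Not Missing")
both$key <- paste(both$dataset, both$status, sep = ": ")

# Move the NAs into the plotting area only after the flag has been computed
both$V1 <- shift_below(both$V1)
both$V2 <- shift_below(both$V2)

key_cols <- c("df1: Missing" = "blue", "df1: Not Missing" = "green",
              "df2: Missing" = "red",  "df2: Not Missing" = "yellow")

p <- ggplot(both, aes(x = V1, y = V2, colour = key, shape = status, size = status)) +
  geom_point() +
  scale_colour_manual(values = key_cols) +
  scale_shape_manual(values = c("Missing" = 17, "Not Missing" = 16)) +
  scale_size_manual(values = c("Missing" = 3, "Not Missing" = 1.5))

p
```

Because the flag is an ordinary column here, any aesthetic (colour, shape, size, alpha) can be mapped to it with the usual manual scales.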

stefan
  • nice, I was trying a similar approach, but I got stuck with the call to interaction, using ..group.. is great. – tjebo Jan 11 '21 at 19:43
  • @tjebo Thanks. I first tried using the identifier directly in the `interaction`, which normally works, but I always got an error. So I just tried with `group` and voila. – stefan Jan 11 '21 at 19:45