I am working with the R programming language. I made the following scatterplot, which shows points from two groups in two different colors:

library(ggplot2)


a = rnorm(10000,10,10)
b = rnorm(10000, 10, 10)
c = as.factor("red")

data_1 = data.frame(a,b,c)

a = rnorm(10000,7,5)
b = rnorm(10000, 7, 5)
c = as.factor("blue")

data_2 = data.frame(a,b,c)

final = rbind(data_1, data_2)

my_plot = ggplot(final, aes(x=a, y=b, col = c)) + geom_point() +  theme(legend.position="top") + ggtitle("My Plot")

My Question: Is there a way to "change the colors of overlapping points"?

Here is what I tried so far:

1) I found the following question (Visualizing two or more data points where they overlap (ggplot R)) and tried the strategy suggested:

linecolors <- c("#714C02", "#01587A", "#024E37")
fillcolors <- c("#9D6C06", "#077DAA", "#026D4E")

# partially transparent points by setting `alpha = 0.5`
ggplot(final, aes(a,b, colour = c, fill = c)) +
  geom_point(alpha = 0.5) +
  scale_color_manual(values=linecolors) +
  scale_fill_manual(values=fillcolors) +
  theme_bw()

This shows the two different colors along with the overlap, but it is quite dark and still not clear. Is there a way to pick better colors/resolutions for this?
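
One variant of this attempt that may read less dark, as a rough sketch rather than a definitive fix: shuffle the row order so that the blue points are not all drawn on top of the red ones, and use smaller, more transparent points. The names `shuffled` and `p_alpha`, the seed, the hex colors, and the alpha and size values are assumptions chosen for illustration.

```r
library(ggplot2)

# Same kind of data as above, regenerated so the sketch is self-contained
set.seed(1)
final <- rbind(
  data.frame(a = rnorm(10000, 10, 10), b = rnorm(10000, 10, 10), c = factor("red")),
  data.frame(a = rnorm(10000, 7, 5),   b = rnorm(10000, 7, 5),   c = factor("blue"))
)

# ggplot2 draws points in row order, so shuffle the rows to interleave the classes
shuffled <- final[sample(nrow(final)), ]

p_alpha <- ggplot(shuffled, aes(a, b, colour = c)) +
  # small, highly transparent points (values are assumptions, tune by eye)
  geom_point(alpha = 0.15, size = 0.6, shape = 16) +
  # two contrasting colors, matched to the factor levels by name
  scale_colour_manual(values = c(red = "#D55E00", blue = "#0072B2")) +
  # keep the legend keys opaque and readable
  guides(colour = guide_legend(override.aes = list(alpha = 1, size = 2))) +
  theme_bw()

p_alpha
```

With interleaved drawing order, the region where both classes occur shows a blend of the two colors instead of whichever class was drawn last.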

2) I found the following link, which shows how to make color gradients for continuous variables: https://drsimonj.svbtle.com/pretty-scatter-plots-with-ggplot2 - but I have discrete colors and I do not know how to apply this.
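
A possible way to carry the gradient idea over to discrete colors, as a minimal sketch: estimate a 2D kernel density separately for each class, then map that density to `alpha` while `colour` stays tied to the class. The helper `point_density()` is hypothetical (not from the linked post), and the sketch assumes the `MASS` package is installed (it normally ships with R). The grid size and alpha range are arbitrary choices.

```r
library(ggplot2)

set.seed(1)
final <- rbind(
  data.frame(a = rnorm(10000, 10, 10), b = rnorm(10000, 10, 10), c = factor("red")),
  data.frame(a = rnorm(10000, 7, 5),   b = rnorm(10000, 7, 5),   c = factor("blue"))
)

# Hypothetical helper: kernel density estimate looked up at each point
point_density <- function(x, y, n = 100) {
  grid <- MASS::kde2d(x, y, n = n)   # density on an n x n grid over the data range
  ix <- findInterval(x, grid$x)      # grid index at or just below each x
  iy <- findInterval(y, grid$y)      # grid index at or just below each y
  grid$z[cbind(ix, iy)]              # density value in the matching grid cell
}

# Compute the density within each class, not across the pooled data
final$dens <- NA_real_
for (g in levels(final$c)) {
  idx <- final$c == g
  final$dens[idx] <- point_density(final$a[idx], final$b[idx])
}

p_dens <- ggplot(final, aes(a, b, colour = c, alpha = dens)) +
  geom_point(size = 0.6, shape = 16) +
  scale_alpha_continuous(range = c(0.05, 0.6), guide = "none") +
  theme_bw()

p_dens
```

Sparse outlying points fade out and the core of each class stays visible, though whether this is clearer than a plain `alpha` setting depends on the data.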

3) I found this question (Any way to make plot points in scatterplot more transparent in R?), which shows how to do this with the base R plot, but not with ggplot2:

addTrans <- function(color,trans)
{
  # This function adds transparency to a color.
  # Define transparency with an integer between 0 and 255,
  # 0 being fully transparent and 255 being fully visible.
  # Works with either color and trans a vector of equal length,
  # or one of the two of length 1.

  if (length(color)!=length(trans)&!any(c(length(color),length(trans))==1)) stop("Vector lengths not correct")
  if (length(color)==1 & length(trans)>1) color <- rep(color,length(trans))
  if (length(trans)==1 & length(color)>1) trans <- rep(trans,length(color))

  num2hex <- function(x)
  {
    hex <- unlist(strsplit("0123456789ABCDEF",split=""))
    return(paste(hex[(x-x%%16)/16+1],hex[x%%16+1],sep=""))
  }
  rgb <- rbind(col2rgb(color),trans)
  res <- paste("#",apply(apply(rgb,2,num2hex),2,paste,collapse=""),sep="")
  return(res)
}


cols <- sample(c("red","green","pink"),100,TRUE)


# Very transparant:
plot(final$a , final$b ,col=addTrans(cols,100),pch=16,cex=1)

But this also does not differentiate between the two color classes that I have.
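
Part of the reason is probably that `cols` above is sampled at random and has nothing to do with the class column `c`. A minimal base R sketch that looks up each point's color from its own class instead, using `adjustcolor()` from `grDevices`; the names `class_cols`, `point_cols`, and `ord`, and the alpha and `cex` values, are assumptions for illustration:

```r
set.seed(1)
final <- rbind(
  data.frame(a = rnorm(10000, 10, 10), b = rnorm(10000, 10, 10), c = factor("red")),
  data.frame(a = rnorm(10000, 7, 5),   b = rnorm(10000, 7, 5),   c = factor("blue"))
)

# One semi-transparent color per class, named by factor level
class_cols <- setNames(adjustcolor(c("red", "blue"), alpha.f = 0.2),
                       c("red", "blue"))

# Look up each point's color from its own class label
point_cols <- unname(class_cols[as.character(final$c)])

# Shuffle the drawing order so neither class ends up entirely on top
ord <- sample(nrow(final))

plot(final$a[ord], final$b[ord], col = point_cols[ord],
     pch = 16, cex = 0.5, xlab = "a", ylab = "b")
```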

Problem: Can someone please suggest how to fix the problem with overlapping points, such that the overlap appears more visible?

Thanks!

stats_noob
  • This is not what you're looking for, but would it make sense to plot histograms/density plots for the x- and y-axis to illustrate the overlap? – JAQuent Dec 08 '21 at 04:59
  • @JAQuent: Thank you for your reply! I considered histograms in the past - however, I am trying to show individual points, so I was hoping to do this without a histogram. I will keep looking for a way to do this! Thank you! – stats_noob Dec 08 '21 at 05:02
  • Sounds like the ggpointdensity package might be useful to you. It’s quite slow in plotting, but makes nice visual effects. https://github.com/LKremer/ggpointdensity – tjebo Dec 08 '21 at 07:48
  • Also, please check this thread, I think it has some nice ideas for dealing with lots of data points: https://stackoverflow.com/questions/7714677/scatterplot-with-too-many-points – tjebo Dec 08 '21 at 07:50

1 Answer

I would use a density heatmap:

# after_stat() replaces stat(), which is deprecated in recent ggplot2 versions
ggplot(final, aes(x = a, y = b, col = c)) +
  stat_density_2d(aes(fill = after_stat(density)), geom = 'raster', contour = FALSE) +
  scale_fill_viridis_c() +
  coord_cartesian(expand = FALSE) +
  geom_point(shape = '.', col = 'white')

or

# after_stat() replaces stat(), which is deprecated in recent ggplot2 versions
ggplot(final, aes(x = a, y = b, col = c)) +
  stat_density_2d(aes(fill = after_stat(level)), geom = 'polygon') +
  scale_fill_viridis_c(name = "density") +
  geom_point(shape = '.')

or

ggplot(final, aes(x=a, y=b, col = c))+
  geom_point(alpha = 0.1) +
  geom_rug(alpha = 0.01)
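
Another option, as a rough sketch and only if showing the classes side by side is acceptable: one panel per class with binned counts, so no points overplot each other at all. The name `p_facet` and the bin count are assumptions; the data is regenerated here so the block runs on its own.

```r
library(ggplot2)

set.seed(1)
final <- rbind(
  data.frame(a = rnorm(10000, 10, 10), b = rnorm(10000, 10, 10), c = factor("red")),
  data.frame(a = rnorm(10000, 7, 5),   b = rnorm(10000, 7, 5),   c = factor("blue"))
)

p_facet <- ggplot(final, aes(x = a, y = b)) +
  geom_bin2d(bins = 60) +                  # count points per rectangular bin
  scale_fill_viridis_c(name = "count") +
  facet_wrap(~ c) +                        # one panel per class, shared axes
  theme_bw()

p_facet
```

Because `facet_wrap()` keeps the axes and fill scale shared by default, the two panels can be compared directly to see where the classes overlap.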

Rfanatic