I am working with the following dataset:

Age = sample(10:99, 50, replace = TRUE)
Level = sample(LETTERS[1:4], 50, replace = TRUE)
# data.frame() keeps Age numeric; cbind() would coerce both columns to character
df = data.frame(Age, Level)

This is my boxplot with jittered points for the variable Age:

library(plotly)
plot_ly(y = ~df$Age, type = "box", boxpoints = "all", jitter = 0.3,
        pointpos = -1.8)

My question is: how do I color the jittered points on the left according to the Level variable? There are four levels in my dataset (A, B, C, D). Points belonging to level A should have one color, points belonging to level B another color, and so on.

I tried

plot_ly(y = ~df$Age, type = "box", boxpoints = "all", jitter = 0.3,
        color = ~df$Level, pointpos = -1.8)

This gives me four separate boxplots, one per level. My goal is a single boxplot with the jittered points colored by the Level variable. Any suggestions or help would be much appreciated.

Science11

1 Answer

I am not sure whether this is possible within a single box trace, but here is an alternative: use subplot() to place a scatter plot next to a box plot.

First create a dummy x variable for your scatterplot.

df$AgeX <- rnorm(50, 2, 0.3)
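
The exact distribution of the dummy variable does not matter much, since it only spreads the points horizontally. As a rough sketch, here is the normal jitter from above next to a uniform alternative that keeps every point inside a fixed band. The seed, the `AgeX_unif` column name, and the band limits are my own assumptions for illustration, not part of the question.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Same structure as the data in the question
df <- data.frame(
  Age   = sample(10:99, 50, replace = TRUE),
  Level = sample(LETTERS[1:4], 50, replace = TRUE)
)

# Normal jitter: centred on 2 with sd 0.3, not bounded
df$AgeX <- rnorm(nrow(df), mean = 2, sd = 0.3)

# Uniform jitter (hypothetical alternative): every point lies in [1.7, 2.3]
df$AgeX_unif <- runif(nrow(df), min = 1.7, max = 2.3)

range(df$AgeX_unif)
```

Either column can be mapped to x in the scatter plot; the uniform version just avoids the occasional point that lands far from the rest.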

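Then combine both plots:

# Axis settings for the dummy x axis (assumed values: hide title and tick labels)
b1y <- list(title = "", showticklabels = FALSE, zeroline = FALSE)
p1 <- plot_ly(df, y = ~Age, x = ~AgeX) %>%
  add_markers(name = ~"jitter", color = ~Level) %>%
  layout(xaxis = b1y)
p2 <- plot_ly(df, y = ~Age) %>%
  add_boxplot(name = ~"boxplot")
p <- subplot(p1, p2, shareY = TRUE, widths = c(0.2, 0.8), margin = 0)
p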

You can remove the legend by appending %>% hide_legend(), and you can adjust the margin and widths arguments of subplot() until the layout looks the way you want.
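
For example, a self-contained sketch with the legend hidden and slightly different panel sizes might look like the following. The seed, the axis settings, and the particular `widths` and `margin` values are assumptions chosen for illustration; tune them to taste.

```r
library(plotly)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same structure as the data in the question
df <- data.frame(
  Age   = sample(10:99, 50, replace = TRUE),
  Level = sample(LETTERS[1:4], 50, replace = TRUE)
)
df$AgeX <- rnorm(nrow(df), 2, 0.3)  # dummy x position for the scatter panel

# Scatter panel: one marker per observation, colored by Level
p1 <- plot_ly(df, x = ~AgeX, y = ~Age) %>%
  add_markers(color = ~Level) %>%
  layout(xaxis = list(title = "", showticklabels = FALSE, zeroline = FALSE))

# Box panel: a single box for Age
p2 <- plot_ly(df, y = ~Age) %>%
  add_boxplot(name = "boxplot")

# widths are fractions of the total width; margin is the gap between panels
p <- subplot(p1, p2, shareY = TRUE, widths = c(0.3, 0.7), margin = 0.02) %>%
  hide_legend()
p
```

Note that hiding the legend also removes the key for the Level colors, so keep it if readers need to tell the levels apart.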

MLavoie