I would like to add a trace to my box plot that draws a line through the medians,

like this:

enter image description here

Here are my plots so far:

enter image description here

library(plotly)
p <- plot_ly(y = ~rnorm(50), type = "box") %>%
  add_trace(y = ~rnorm(50, 1))

p
vestland
Sup'A
  • Would you mind producing a full [mcve](/help/mcve)? I can achieve this pretty easily with Python, and the two versions usually have the same functionality. – rpanai Feb 06 '20 at 22:03
  • @Sup'A How did my suggestion work out for you? – vestland Mar 29 '20 at 21:58

2 Answers

Start with a scatter plot using plot_ly(..., type='scatter', mode='lines', ...), then follow up with one add_boxplot(..., inherit=FALSE, ...) per box plot. Here's how you do it for an entire data.frame:

enter image description here

Complete code with sample data:

library(dplyr)
library(plotly)

# data
df <- data.frame(iris) %>% select(-c('Species'))
medians <- apply(df,2,median)

# create common x-axis values for median line and boxplots
xVals <- seq(0, length(medians)-1, by=1)

# plotly median line setup
p <- plot_ly(x = xVals, y=medians, type='scatter', mode='lines', name='medians')

# add one box trace per column; inherit = FALSE stops the boxes from
# picking up the attributes (x, y, mode) set for the median line
for (col in names(df)) {
  p <- p %>% add_boxplot(y = df[[col]], inherit = FALSE, name = col)
}

# manage layout
p <- p %>% layout(xaxis = list(range = c(min(xVals)-1, max(xVals)+1)))
p
vestland
  • Can I use this trick if I don't have a median column in my data? I tried `medians <- median(c(t1$CTQ))`, but it calculates one median over all rows. – Sup'A Feb 11 '20 at 09:19
  • @Sup'A We can talk more about this in a [chatroom](https://chat.stackoverflow.com/rooms/207738/room-for-vestland-and-supa) if you'd like. – vestland Feb 13 '20 at 08:41
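
For the case raised in the comment above (a single value column and no precomputed medians), one possible approach is to compute one median per group and draw that as the line trace. This is only a sketch: the data frame `t1` below is made-up sample data, and the grouping column `grp` is an assumed name, since the real structure of `t1` is not shown in the thread.

```r
library(plotly)

# Hypothetical long-format data: a grouping column (grp, assumed name)
# and the value column CTQ mentioned in the comment
set.seed(1)
t1 <- data.frame(
  grp = rep(c("A", "B", "C"), each = 30),
  CTQ = c(rnorm(30, 0), rnorm(30, 1), rnorm(30, 0.5))
)

# One median per group, instead of a single median over all rows
med <- aggregate(CTQ ~ grp, data = t1, FUN = median)

# Box plot per group, plus a line trace through the group medians
p <- plot_ly(t1, x = ~grp, y = ~CTQ, type = "box", name = "CTQ") %>%
  add_trace(data = med, x = ~grp, y = ~CTQ,
            type = "scatter", mode = "lines+markers",
            name = "median", inherit = FALSE)
p
```

Calling `median(t1$CTQ)` directly returns a single number because it ignores the grouping; `aggregate()` (or `tapply()`, or `dplyr::summarise()` after `group_by()`) applies it once per group.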

Another option is to use ggplot2 and convert the result to plotly:

library(ggplot2)
library(dplyr)
library(tidyr)
library(plotly)

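p <- iris %>%
  pivot_longer(-Species) %>%
  ggplot(aes(x = name, y = value, col = name)) +
  geom_boxplot() +
  # group = 1 draws a single line through the per-column medians;
  # `fun` replaces the deprecated `fun.y` (needs ggplot2 >= 3.3.0)
  stat_summary(inherit.aes = FALSE, aes(x = name, y = value, group = 1),
               fun = median, geom = "line")
ggplotly(p)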

A brief explanation of the code: I use pivot_longer from tidyr to cast the data frame into long format, then draw the boxplot with the column names as both the x variable and the colour.

In the stat_summary part, I specify the same x and y variables again but omit the colour and add group=1. This tells stat_summary to treat the whole data frame as one group, summarize the y values at each x value with the median, and draw a line through the results.
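
To check which values that line passes through, the per-column medians can be computed directly. The following is a minimal base R sketch using the built-in iris data; the names `long` and `medians` are just illustrative, and `stack()` is used here in place of pivot_longer only to keep the example free of extra packages.

```r
# Reshape the numeric iris columns to long format:
# `values` holds the measurements, `ind` the original column name
long <- stack(iris[, names(iris) != "Species"])

# One median per original column; these are the y values of the line
medians <- tapply(long$values, long$ind, median)
medians
```

Note that ggplot2 orders a character x variable alphabetically, so the order of the boxes on the axis may differ from the column order returned here.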

enter image description here

StupidWolf