I am attempting to create an interactive plot of hexbins where a user can click on a given hexbin and receive a list of all observations of the original data frame that were grouped in that clicked hexbin.
Below is a MWE that seems pretty close to my goal. I am using Shiny, hexbin(), and ggplotly.
app.R
library(shiny)
library(plotly)
library(data.table)
library(GGally)
library(reshape2)
library(hexbin)
ui <- fluidPage(
plotlyOutput("plot"),
verbatimTextOutput("click")
)
server <- function(input, output, session) {
#Create data
set.seed(1)
bindata <- data.frame(x=rnorm(100), y=rnorm(100))
h <- hexbin (bindata, xbins = 5, IDs = TRUE, xbnds = range (bindata$x), ybnds = range (bindata$y))
# As we have the cell IDs, we can merge this data.frame with the proper coordinates
hexdf <- data.frame (hcell2xy (h), ID = h@cell, counts = h@count)
# I have tried different methods of generating the ggplot object
#p <- ggplot(hexdf, aes(x=x, y=y, fill = counts)) + geom_hex(stat="identity")
#p <- ggplot(hexdf, aes(x=x, y=y, fill = ID)) + geom_hex(stat="identity")
#p <- ggplot(hexdf, aes(x=x, y=y, fill = counts, colours = ID)) + geom_hex(stat="identity")
#p <- ggplot(hexdf, colours = ID, aes(x=x, y=y, colours = ID, fill = counts)) + geom_hex(stat="identity")
p <- ggplot(hexdf, aes(x=x, y=y, fill = counts, ID=ID)) + geom_hex(stat="identity")
output$plot <- renderPlotly({
ggplotly(p)
})
d <- reactive(event_data("plotly_click"))
output$click <- renderPrint({
if (is.null(d())){
"Click on a state to view event data"
}
else{
str(d())
#Next line would deliver all observations from original data frame (bindata) that are in the clicked hexbin... if d() from event_data() was returning ID instead of curveNumber
#bindata[which(h@cID==d()$curveNumber),]
}
})
}
shinyApp(ui, server)
Inside the h@cID object is the ID for all data points (showing which data point goes into which hexbin). Hence, I feel that if I am able to get event_data() to return the hexbin ID when a user clicks, then I should be able to successfully map that hexbin ID back to the h@cID object to obtain the corresponding data points.
Unfortunately, the way I have it written currently, event_data() will return "curveNumber" which does not seem to equal ID. It also does not seem to translate into ID (even when using all the information in the h object - not just h@cID, but also more such as h@xcm, h@ycm, etc.)
Is there any method anyone is aware of to solve this type of problem? Any ideas would be appreciated!
Note: My two most recent posts (including a bounty) are very similar to this question. They are located here (Interactive selection in ggplotly with geom_hex() scatterplot) and (Obtain observations in geom_hex using plotly and Shiny). The difference is that I have been making the problem more simple each step. Thank you.
Edit - Possible Answer
I think I may have obtained the solution for this problem. Like what @oshun noticed, there is some hidden conversion between the curveNumber returned from event_data() and the hexbin ID. It seems that curveNumbers are first sorted from smallest to largest by increasing count of the hexbins. Then, within a given count, it seems that curverNumber is further sorted from smallest to largest by increasing ID. However, the ID is sorted by character (not number). For instance, the number 18 would be considered smaller than the number 2 because 18 starts with the digit 1 which is smaller than the digit 2.
You can see this pattern when the full dataset in this example is represented with count, ID, and curveNumber below:
count=1 (ID=24) —> curveNumber 0
count=1 (ID=26) —> curveNumber 1
count=1 (ID=34) —> curveNumber 2
count=1 (ID=5) —> curveNumber 3
count=1 (ID=7) —> curveNumber 4
count=2 (ID=11) —> curveNumber 5
count=2 (ID=14) —> curveNumber 6
count=2 (ID=19) —> curveNumber 7
count=2 (ID=23) —> curveNumber 8
count=2 (ID=3) —> curveNumber 9
count=2 (ID=32) —> curveNumber 10
count=2 (ID=4) —> curveNumber 11
count=3 (ID=10) —> curveNumber 12
count=3 (ID=13) —> curveNumber 13
count=3 (ID=33) —> curveNumber 14
count=3 (ID=40) —> curveNumber 15
count=3 (ID=9) —> curveNumber 16
count=4 (ID=17) —> curveNumber 17
count=4 (ID=20) —> curveNumber 18
count=5 (ID=28) —> curveNumber 19
count=5 (ID=8) —> curveNumber 20
count=6 (ID=21) —> curveNumber 21
count=8 (ID=27) —> curveNumber 22
count=9 (ID=22) —> curveNumber 23
count=11 (ID=16)—> curveNumber 24
count=14 (ID=15)—> curveNumber 25
Below is my tentative solution for this problem. I am pretty sure it works for this this dataset, but I plan to test it on more datasets to be sure.
app.R
library(shiny)
library(plotly)
library(data.table)
library(GGally)
library(reshape2)
library(hexbin)
ui <- fluidPage(
plotlyOutput("plot"),
verbatimTextOutput("click")
)
server <- function(input, output, session) {
# Curve number to ID
cnToID <- function(h){
df <- data.frame(table(h@cID))
colnames(df) <- c("ID","count")
cnID <- df[order(df$count,as.character(df$ID)),]
cnID$curveNumber <- seq(0, nrow(cnID)-1)
return(cnID)
}
# Create data
set.seed(1)
bindata <- data.frame(x=rnorm(100), y=rnorm(100))
h <- hexbin (bindata, xbins = 5, IDs = TRUE, xbnds = range (bindata$x), ybnds = range (bindata$y))
hexdf <- data.frame (hcell2xy (h), ID = h@cell, counts = h@count)
p <- ggplot(hexdf, aes(x=x, y=y, fill = counts, ID=ID)) + geom_hex(stat="identity")
#p <- ggplot(hexdf, aes(x=x, y=y, fill = counts), ID=ID) + geom_hex(stat="identity")
cnID <- cnToID(h)
output$plot <- renderPlotly({
p2 <- ggplotly(p)
for (i in 1:nrow(hexdf)){
p2$x$data[[i]]$text <- gsub("<.*$", "", p2$x$data[[i]]$text)
}
p2
})
d <- reactive(event_data("plotly_click"))
output$click <- renderPrint({
if (is.null(d())){
"Click on a state to view event data"
}
else{
clickID <- as.numeric(as.character(cnID[which(cnID$curveNumber==d()$curveNumber),]$ID))
clickID
bindata[which(h@cID==clickID),]
}
})
}
shinyApp(ui, server)
Edit 2: