General Goal

I would like to be able to use RShiny to quickly plot large amounts of data that come from R, and then make small modifications or additions without re-rendering all of the plotted data.

Specific Task

  1. Plot a large number of points (<100 000) in a scatter plot. I am okay with a short (<5 sec) but perceivable delay in this task.
  2. In response to a mouse click, detect the nearest plotted point.
  3. Using some information queried from data related to this point, highlight a small number of other points (<10). I would like this to be instantaneous.
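Steps 2 and 3 can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `pts`, its `group` column, and the helper names `nearest_index` and `related_rows` are hypothetical, and "related" is assumed to mean "shares the same group value". Distances are measured in data units; in a Shiny app the built-in `nearPoints()` helper, which works in pixel units, could play the role of step 2.

```r
# Hypothetical example data: columns x, y and a lookup key `group`
set.seed(1)
pts <- data.frame(
  x = runif(1000, 1, 100),
  y = runif(1000, 1, 100),
  group = sample(1:200, 1000, replace = TRUE)
)

# Step 2: index of the plotted point closest to a click at (cx, cy),
# using squared Euclidean distance in data coordinates
nearest_index <- function(data, cx, cy) {
  which.min((data$x - cx)^2 + (data$y - cy)^2)
}

# Step 3: up to max_n other rows related to point i
# (assumed rule: same `group` value as the clicked point)
related_rows <- function(data, i, max_n = 10) {
  same <- which(data$group == data$group[i] & seq_len(nrow(data)) != i)
  data[head(same, max_n), , drop = FALSE]
}

i <- nearest_index(pts, cx = 50, cy = 50)
highlight <- related_rows(pts, i)
```

Both lookups are vectorised, so they should be cheap for 100 000 rows; the expensive part is redrawing the plot, not finding the points.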

Current Approach

I currently use ggplot2 and RShiny to make apps to help with data analysis. In general I'm very pleased with this combination. So ideally the solution will allow me to still mostly use these tools.

Using only the built-in functionality of RShiny and ggplot2, I have no problem accomplishing my task, except that step 3 cannot be done independently, without redoing step 1. It is my understanding that it is not possible to update or overlay ggplot2 plots without re-rendering them in their entirety.

So, what I am looking for is one of the following to achieve my general goal, in descending order of preference:

  1. A way to overlay or modify ggplot2 plots without re-rendering.
  2. An R-based extension, fork, or similar of ggplot2 that allows this.
  3. An alternative to ggplot2 that is similarly easy to integrate with RShiny and R data and that allows this. Maybe some interface to an existing JavaScript library? I would still like to be able to manipulate and interact with my plot using all of the RShiny machinery I am familiar with.

I have some knowledge of js but do not feel like learning something like d3 to accomplish such a small task. (If it's possible to use a small bit of d3 or js to do this, that would be great though!) It would be fine to be able to efficiently draw svg on top of ggplot2 plots, but using the same coordinate system.

I am aware of this question, but the solution provided was specific to time-series data.

mb7744
  • Have you looked at `plotly`? It has some interactivity. There is also `ggvis`, but it is really not finished yet - apparently. – Mike Wise Dec 20 '16 at 00:00
  • I briefly looked at plotly. My impression was that the ggplotly function was much too slow to accomplish step 1. If there is a way to use plotly with R in some stripped-down way then that could be good. I'm not familiar with ggvis, maybe that's the ticket. – mb7744 Dec 20 '16 at 00:02
  • `ggvis` is Hadley Wickham's next big work and is supposed to add interactivity to ggplot. Unfortunately it turned out to require more restructuring than he thought at first, so it is taking a year or two longer than originally announced. – Mike Wise Dec 20 '16 at 00:05
  • You can also draw directly on a ggplot output with grid. But it is not easy and documentation as to the various coordinate systems and layouts is sparse at best (mostly you need to look at the code to figure it out). – Mike Wise Dec 20 '16 at 00:07
  • I'd be willing to put some work into doing it with grid because then the rest of my workflow can remain exactly the same. Is there a reference or small example you can provide me with to get me started on modifying an existing ggplot2 plot with grid? I can start reading the entire grid documentation but that seems like overkill. – mb7744 Dec 20 '16 at 00:17
  • Not only overkill, but probably not very productive since grid documentation does not care about ggplot conventions. – Mike Wise Dec 20 '16 at 00:23
  • So have a look at the github source code for ggplot2 (it is not your usual R code). Particularly have a look at the `gridExtra` library from the github user baptiste. In that library you will see a lot of things you will need to understand. – Mike Wise Dec 20 '16 at 00:29
  • I would start with understanding how `grid.arrange` works. Always meant to do that myself... – Mike Wise Dec 20 '16 at 00:44

1 Answer

Here is a solution with plotly. It does re-render the entire plot, but it's fast so perhaps will still meet your requirements. I think you'll see that introducing Plotly should not majorly disrupt your workflow.

Note that I use Plotly's WebGL function for speed. The example below uses 100,000 points. I've also included an example of how you would convert your existing ggplot2 object. For Plotly click events, see this.

library(shiny)
library(dplyr)
library(plotly)
library(ggplot2)

ui <- fluidPage(

  titlePanel("Highlight nearby points"),

  sidebarLayout(
    sidebarPanel(width=3,
      p("Click on a point. Nearby points will be highlighted.")
    ),

    mainPanel(
      plotlyOutput("plot")
    )
  )
)

# Data
df <- tibble(x = runif(1e+05,1,100), y = runif(1e+05,1,100))

server <- function(input, output, session) {

  output$plot <- renderPlotly({

    # Gather click data
    event.data <- event_data("plotly_click")

    # Plotly object
    p <- plot_ly(df, x = ~x, y = ~y, type = "scatter", mode = "markers") 

    # Alternative: use existing ggplot

    # gg <- ggplot(df, aes(x = x, y = y)) +
    #   geom_point()
    # 
    # p <- plotly_build(gg)

    # End alternative

    # Check for click data
    if(!is.null(event.data)) {

      # If click data exists, create new markers based on range criteria and use a different color
      d <- filter(df,
                  x < event.data$x+10 & x > event.data$x-10,
                  y < event.data$y+10 & y > event.data$y-10)
      p <- add_markers(p, data = d, color = I("red"))

    }

    # Use WebGL for faster plotting of many points
    p %>% toWebGL()

  })
}

# Run the application 
shinyApp(ui = ui, server = server)
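
For comparison, here is a hedged sketch of a variant that keeps the base plot out of the click reactivity and sends only the highlighted points to the browser. It assumes a plotly version that provides `plotlyProxy()` and `plotlyProxyInvoke()`; the helper `nearby_points`, the source name `"scatter"`, and the flag `has_highlight` are names invented for this example. Whether it is fast enough for the task has not been measured here.

```r
library(shiny)
library(plotly)

# Same kind of synthetic data as in the example above
df <- data.frame(x = runif(1e5, 1, 100), y = runif(1e5, 1, 100))

# Points within `radius` of (cx, cy) on both axes (same rule as the filter above)
nearby_points <- function(data, cx, cy, radius = 10) {
  data[abs(data$x - cx) < radius & abs(data$y - cy) < radius, , drop = FALSE]
}

ui <- fluidPage(plotlyOutput("plot"))

server <- function(input, output, session) {

  # Base plot: it does not read the click event, so a click does not invalidate it
  output$plot <- renderPlotly({
    plot_ly(df, x = ~x, y = ~y, type = "scattergl", mode = "markers",
            source = "scatter")
  })

  # Tracks whether a highlight trace (trace index 1, zero-based) already exists
  has_highlight <- reactiveVal(FALSE)

  observeEvent(event_data("plotly_click", source = "scatter"), {
    click <- event_data("plotly_click", source = "scatter")
    d <- nearby_points(df, click$x, click$y)
    proxy <- plotlyProxy("plot", session)

    # Drop the previous highlight trace before adding a new one
    if (has_highlight()) {
      plotlyProxyInvoke(proxy, "deleteTraces", list(1L))
    }

    # I() keeps length-one vectors as JSON arrays when the message is serialised
    plotlyProxyInvoke(proxy, "addTraces", list(
      x = I(d$x), y = I(d$y),
      type = "scattergl", mode = "markers",
      marker = list(color = "red")
    ))
    has_highlight(TRUE)
  })
}

# Only launch when run interactively, so sourcing the file does not block
if (interactive()) shinyApp(ui, server)
```

The design choice is that `renderPlotly()` runs once, while the observer sends only the small highlight trace through the proxy, so the 100,000 base points are not sent to the browser again on each click.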
Vance Lopez
  • I very much appreciate the effort, however at least for me this solution has worse performance than the same thing without plotly. If plotly still requires the plot to be rebuilt, I don't see what its use has added in regards to my problem. – mb7744 Dec 20 '16 at 19:44
  • Compare performance with this: https://gist.github.com/anonymous/380198a563e9dfce27a72b8c468315a2 – mb7744 Dec 20 '16 at 20:43
  • @mb7744 Interesting, it actually was a bit slower for me. Okay I'll think about this some more, but I'm not sure how you'd output a plot object without re-rendering. You need to invalidate the render object in order to update the output. In doing so, it would run all the code inside the render. – Vance Lopez Dec 20 '16 at 21:54