
I want to create an interactive bar chart that lets users filter observations based on a range of values, and then renders counts per class for the selected time period dynamically. Since the filtered data needs to be available for numerous such graphs, I thought a combination of crosstalk and plotly/ggplot might prove valuable.

I attached a reprex further below that uses shared data and filtering functionality from crosstalk to allow for the dynamic filtering part. When I knit the document, the bar chart renders nicely as long as the full range of values is selected (default).

[Screenshot: bar chart with the full range selected]

However, the plotting region becomes empty for any other, i.e., user-adjusted, range.

[Screenshot: empty plotting region with a partial range selected]

What exactly am I missing here? I assume there must be a difference between full and filtered shared datasets that ggplotly() cannot handle properly. Is there another approach I could follow to achieve my goal?

Here's the content of my .Rmd file:

---
title: mpg class counts filtered by time period
output: html_document
---

```{r echo = FALSE, message = FALSE, warning = FALSE}
library(crosstalk)
library(plotly)

# Wrap data frame in SharedData
sd = SharedData$new(mpg)

# Create a filter input
filter_slider("Year", "Year", sd, column = ~ year, step = 1, width = 250)

# Render graph
bscols(
  ggplotly(
    ggplot(aes(x = class), data = sd) + 
      geom_bar()
  )
)

```
fdetsch

2 Answers


I think this may be because of a limitation described in the official Crosstalk documentation: "Crosstalk currently only works for linked brushing and filtering of views that show individual data points, not aggregate or summary views (where “observations” is defined as a single row in a data frame). For example, histograms are not supported since each bar represents multiple data points; but scatter plot points each represent a single data point, so they are supported."

If you change it to a scatter plot (geom_point()), filtering seems to work.

---
title: mpg class counts filtered by time period
output: html_document
---

```{r echo = FALSE, message = FALSE, warning = FALSE}
library(crosstalk)
library(plotly)

# Wrap data frame in SharedData
sd = SharedData$new(mpg)

# Create a filter input
filter_slider("Year", "Year", sd, column = ~ year, step = 1, width = 250)

# Render graph
bscols(
  ggplotly(
    ggplot(aes(hwy, cty), data = sd) + 
      geom_point()
  )
)

```
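For the bar chart itself, one option that may be worth trying is to let plotly.js do the counting in the browser rather than precomputing the bars in R. The sketch below assumes that a histogram trace built with `plot_ly()` and `add_histogram()` is re-aggregated on the client whenever the crosstalk filter changes. I have not confirmed this across plotly versions, so treat it as a starting point, not a verified fix. The names `bars` and `page` are only illustrative.

```r
library(crosstalk)
library(plotly)

# Wrap the data frame so the slider and the plot share one filter state
sd <- SharedData$new(ggplot2::mpg)

# Histogram trace on a discrete x: the per-class counts are computed by
# plotly.js from the raw rows (assumption: this is what lets filtering work)
bars <- plot_ly(sd, x = ~class) %>%
  add_histogram()

# Place the slider next to the chart
page <- bscols(
  widths = c(3, 9),
  filter_slider("year", "Year", sd, column = ~year, step = 1, width = 250),
  bars
)

# In an .Rmd chunk, end the chunk with `page` to render both widgets.
```

Because every row of `sd` is still sent to the browser, the same `SharedData` object can be passed to further plots that should follow the slider.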
Jakub.Novotny
  • Thanks for sharing your thoughts. But why would it work for the full time period, then? After all, the bars in the default view are already an abstraction of the dataset. – fdetsch Aug 11 '20 at 13:53
  • I have added a modification of your code to support my claim. As for why it works for the full time period, I don't know, I am sorry. – Jakub.Novotny Aug 11 '20 at 13:56
  • Yeah, that's what I would expect for raw data points in a scatter plot. I am more interested in why summary views (like bar charts) are produced for full, but not for filtered data. – fdetsch Aug 11 '20 at 14:03

Would the following work for you? If you want to filter by date, you might want to have a look at plotly::rangeslider.

library(tidyverse)
library(plotly)

df <- crosstalk::SharedData$new(mpg)$data() %>%
  group_by(year, class) %>%
  count() %>%
  mutate(year = as.factor(year))
  
df %>%
  plot_ly(x = ~class, y = ~n, color = ~year) %>%
  add_bars() %>%
  layout(barmode = "stack")
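For reference, a minimal sketch of what `rangeslider()` looks like on a date axis. It uses the `economics` data set from ggplot2 purely as an illustration, since `mpg` has no date column. The object name `p` is arbitrary.

```r
library(plotly)

# ggplot2::economics contains monthly observations with a Date column
p <- plot_ly(ggplot2::economics, x = ~date, y = ~unemploy) %>%
  add_lines() %>%
  rangeslider()  # adds a draggable range selector beneath the x axis

# Print `p` (or leave it as the last expression of a chunk) to render it.
```

As far as I can tell, the slider only changes the visible x-axis range of this one figure and does not subset the underlying data.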
Jakub.Novotny
  • Thanks again for the effort. In my real-life use case, I am dealing with daily observations over a long time period, so manually deselecting dates does not seem feasible. As far as I understand `rangeslider`, it doesn't allow me to subset my input data permanently. As I said, there are a lot of figures coming afterwards that all depend on the filtered data, so a solution using `SharedData` or the like would be preferable. – fdetsch Aug 11 '20 at 15:41