Setting up a conditional group_by

Question

I have a set of data that looks as such:

+----------+------------+-------+-------+
|  step1   |   step2    | step3 | step4 |
+----------+------------+-------+-------+
| Region 1 | District A | 1A    |   571 |
| Region 1 | District A | 1A    |   356 |
| Region 1 | District A | 1B    |   765 |
| Region 1 | District B | 1B    |   752 |
| Region 2 | District C | 2C    |   885 |
| Region 2 | District C | 2D    |    73 |
| Region 2 | District D | 2D    |   241 |
| Region 2 | District D | 2D    |   823 |
| Region 3 | District E | 3E    |   196 |
| Region 3 | District E | 3E    |   103 |
| Region 3 | District F | 3E    |   443 |
| Region 3 | District F | 3F    |   197 |
+----------+------------+-------+-------+

I have setup the following script, which in the manner it is written, uses the selectizeGroupServer to automatically setup filtering between step1, step2, and step3 so they're linked together (i.e. If you filter for Region 1 it will only return the relevant options in Step2 and Step3.

The script below returns the results I'm looking for if you wanted it to group_by_all in a straight forward manner. So on initial run, it will show the graphed output of all 11 results. If I filter by Region 1, it will return a graph of all four figures in step4 linked to Region 1.

But I want to set it up in a way where when I select an option, it will actually group by the hierarchy option below it. So if I filter by Region 1, it will instead return two columns: The summed aggregate of District A (1692) and the summed aggregate of District B (752). If I have both Region 1 AND District A selected, it would return two columns: The aggregate of 1A (927) and the aggregate of 1B that is tied to District A (765).

How may I set it up in a way that accomplishes this?

library(highcharter)
library(shiny)
library(shinyWidgets)
library(dplyr)

step1 <- c('Region 1', 'Region 1', 'Region 1', 'Region 1', 'Region 2', 'Region 2', 'Region 2', 'Region 2', 'Region 3', 'Region 3', 'Region 3', 'Region 3')
step2 <- c('District A', 'District A', 'District A', 'District B', 'District C', 'District C', 'District D', 'District D', 'District E', 'District E', 'District F', 'District F')
step3 <- c('1A', '1A', '1B', '1B', '2C', '2D', '2D', '2D', '3E', '3E', '3E', '3F')
step4 <- c(571,356,765,752,885,73,241,823,196,103,443,197)

ui <- fluidPage(
  fluidRow(
    column(
      width = 5, offset = 1,
      panel(
        selectizeGroupUI(
          id = "foo",
          params = list(
            Step1 = list(inputId = "step1", title = "Step1:"),
            Step2 = list(inputId = "step2", title = "Step2:"),
            Step3 = list(inputId = "step3", title = "Step3:")
          ))
      ),
      highchartOutput(outputId = "table")
    )
  )
)

server <- function(input, output, session) {

  abc <- callModule(
    module = selectizeGroupServer,
    id = "foo",
    data = df,
    vars = c("step1", "step2", "step3")
  )

  output$table <- renderHighchart({

    bar <- abc()

    xyz <- bar %>% filter(is.null(input$step1) | step1 %in% input$step1,
                        is.null(input$step2) | step2 %in% input$step2,
                        is.null(input$step3) | step3 %in% input$step3) %>% group_by_all() %>% summarise(results = sum(step4))


    highchart() %>% hc_add_series(data = xyz, type = "column", hcaes(y = results),
                                  showInLegend = TRUE) %>% hc_add_theme(hc_theme_flat())


  })


}

Thanks!

Disclosure: [this question](https://stackoverflow.com/questions/52023709/what-can-r-do-about-a-messy-data-format) was asked by me. — Rui Barradas, Dec 14 '19 at 23:28
@akrun - what do you mean by dput? You can see what df looks like in the table above. — , Dec 14 '19 at 23:31
@akrun just edited the script to make it easy to throw into a dataframe or however you think is best at setting it up — , Dec 14 '19 at 23:39
Ah, yes, of course. Thanks for catching that. Edited my initial post. — , Dec 14 '19 at 23:43
One thing I am trying to understand is how the `input`s are getting changed here. In your case, you are doing the `filter` with `|`. If you need a hierarchial group, it is much simpler with `library(data.table);dt1 <- as.data.table(df1); rollup(dt1, j = sum(step4), by = c("step1", "step2", "step3"))` and then do the `filter` — akrun, Dec 15 '19 at 00:02
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/204275/discussion-between-gooponyagrinch-and-akrun). — , Dec 15 '19 at 00:05
@akrun Whatever you think is the best way to answer it, you're welcome to make whatever changes you want! — , Dec 15 '19 at 00:06
@akrun you seem to know this much better than I do, so I'm more than open to hearing how you would write it. Don't worry about fully abiding my initial script. — , Dec 15 '19 at 00:12
sorry, I tried it, I have some other things came up. I have to go. I hope you get a response from somebody or else will try later. Thank you — akrun, Dec 15 '19 at 00:56
@akrun Thanks. I'll keep trying, but I'll probably need your help (assuming someone else doesn't answer). — , Dec 15 '19 at 01:38

score 2 · Answer 1 · answered Dec 16 '19 at 13:27

First, we need to find out which column to group by. In this case, I assume it is the first column with more than 1 option. The rest of the code is pretty similar, except for the group_by_all being replaced by group_by_at.

output$table <- renderHighchart({

        bar <- abc()

        # find out which column to group by (first column with more than 1 distinct value)
        summ_column <- bar %>%
            summarise_all(~ length(unique(.))) %>% {colnames(.)[.>1]} %>% first()

        xyz <- bar %>% group_by_at(summ_column) %>% summarise(results = sum(step4))


        highchart() %>% hc_add_series(data = xyz, type = "column", hcaes(y = results),
                                      showInLegend = TRUE) %>% hc_add_theme(hc_theme_flat())


    })

This will not work if you select more than 1 value for a single option, but that solution should be pretty similar.

score 2 · Answer 2 · answered Dec 16 '19 at 13:37

Seems like you are looking for aggregate. Please check the following:

library(highcharter)
library(shiny)
library(shinyWidgets)
# library(dplyr)

DF <- data.frame(
  step1 = c('Region 1', 'Region 1', 'Region 1', 'Region 1', 'Region 2', 'Region 2', 'Region 2', 'Region 2', 'Region 3', 'Region 3', 'Region 3', 'Region 3'),
  step2 = c('District A', 'District A', 'District A', 'District B', 'District C', 'District C', 'District D', 'District D', 'District E', 'District E', 'District F', 'District F'),
  step3 = c('1A', '1A', '1B', '1B', '2C', '2D', '2D', '2D', '3E', '3E', '3E', '3F'),
  step4 = c(571,356,765,752,885,73,241,823,196,103,443,197),
  stringsAsFactors = FALSE)

ui <- fluidPage(
  fluidRow(
    column(
      width = 5, offset = 1,
      panel(
        selectizeGroupUI(
          id = "foo",
          params = list(
            Step1 = list(inputId = "step1", title = "Step1:"),
            Step2 = list(inputId = "step2", title = "Step2:"),
            Step3 = list(inputId = "step3", title = "Step3:")
          ))
      ),
      highchartOutput(outputId = "table")
    )
  )
)

server <- function(input, output, session) {

  abc <- callModule(
    module = selectizeGroupServer,
    id = "foo",
    data = DF,
    vars = c("step1", "step2", "step3")
  )

  output$table <- renderHighchart({
    req(abc())
    bar <- aggregate(step4 ~ step1+step2, abc(), sum)
    highchart() %>% hc_add_series(data = bar, type = "column", hcaes(y = step4), showInLegend = TRUE) %>% hc_add_theme(hc_theme_flat())
  })

}

shinyApp(ui, server)

Setting up a conditional group_by

2 Answers2