Trying to make a table in R where I group columns by variables of a vector

Question

I'm trying to generate a neat table using the kableExtra / gt packages (or anything that works), with a goal of having my value columns grouped by the different Variables:

data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                   Variables = c(letters[1:3],letters[1:3]),
                   Count = c(45, 76, 43, 23, 11, 46),
                   Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% arrange(Name, Variables)

# Desired output:
#             a              b               c
#       Count Percent | Count Percent | Count Percent
# Mary   45    45%       76    56%       43    89%
# Jane   23    65%       11    88%       46    91%

I can't seem to figure out how to go this, and the closest I can come is:

library(gt)
gt(data, rowname_col = "Variables", groupname_col = "Name")

output

I'm taking inspiration that this can be done from the graph shown in this page: table with output columns grouped by year variable

Thanks for your help!

UPDATE: Solved based on post from Stefan:

data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                   Variables = c(letters[1:3],letters[1:3]),
                   Count = c(45, 76, 43, 23, 11, 46),
                   Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% 
        arrange(desc(Name), Variables)

# Helper to put the columns in the right order
cols_order <- unlist(lapply(c("a", "b", "c"), function(x) paste(x, c("Count", "Percent"), sep = "_")))

data_wide <- data %>% 
        pivot_wider(names_from = "Variables", values_from = c(Count, Percent), names_glue = "{Variables}_{.value}") %>% 
        # Reorder columns
        select(all_of(c("Name", cols_order)))

data_wide %>% 
        gt(rowname_col = "Name") %>%
        tab_spanner_delim(delim = "_") %>%
        fmt_percent(ends_with("Percent"), decimals = 0)

stefan · Accepted Answer · 2021-01-09T16:53:55.613

To achieve your desired result you could first reshape your data to wide format using e.g. tidy::pivot_wider. Next step is to put the columns in the right order. To this end I reorder the cols of the df but this could probably also be done via gt. The remainder is styling the table. To group by Variables you could make use of tab_spanner_delim and get nicely formatted percentages via fmt_percent:

EDIT Thanks to @Shoesoff for pointing out that my original solution could be simplified considerably by making use of tab_spanner_delim instead of tab_spanner.

Improved answer

library(gt)
library(tidyr)
library(dplyr)

data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                   Variables = c(letters[1:3],letters[1:3]),
                   Count = c(45, 76, 43, 23, 11, 46),
                   Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91)) %>% 
  arrange(desc(Name), Variables)

# Helper to put the columns in the right order
cols_order <- unlist(lapply(c("a", "b", "c"), function(x) paste(x, c("Count", "Percent"), sep = "_")))

data_wide <- data %>% 
  pivot_wider(names_from = "Variables", values_from = c(Count, Percent), names_glue = "{Variables}_{.value}") %>% 
  # Reorder columns
  select(all_of(c("Name", cols_order)))

data_wide %>% 
  gt(data, rowname_col = "Name") %>% 
  tab_spanner_delim(
    delim = "_"
  ) %>% 
  fmt_percent(ends_with("Percent"), decimals = 0)

Hi Stefan, based on your code I changed a small part at the end that should make this more scalable for a larger dataset with more name variables, and thought I'd share it here: ``` data_wide %>% gt(rowname_col = "Name") %>% tab_spanner_delim(delim = "_") %>% fmt_percent(ends_with("Percent"), decimals = 0) ``` — Shoesoff, Jan 09 '21 at 16:23
Hi Shoesoff. Thanks for sharing. This is reduces the code considerably. I will make an edit to my answer so that it's more useful for others. Best S. — stefan, Jan 09 '21 at 16:48

score 0 · Answer 2 · edited Jan 09 '21 at 13:01

Get the result

data <- data.frame(Name = c("Mary","Mary","Mary","Jane","Jane","Jane"),
                   Variables = c(letters[1:3],letters[1:3]),
                   Count = c(45, 76, 43, 23, 11, 46),
                   Percent = c(0.45, 0.56, 0.89, 0.65, 0.88, 0.91))

v1 <- xtabs(cbind(Count, Percent) ~ Name + Variables, data)

Reorganise the data:

Mary <- apply(v1[2,,], 1, c)
dim(Mary) <- NULL
Jane <- apply(v1[1,,], 1, c)
dim(Jane) <- NULL

Create the table with kable and add an extra header with kableExtra.

require(knitr)
require(kableExtra)
add_header_above(kable(rbind(Mary, Jane), col.names = rep(c("Count", "Percent"), 3)),
                 c("", "a" = 2, "b" = 2, "c" = 2))

The rest is a little formatting to add more space and format the percentages...

^{Created on 2021-01-09 by the reprex package (v0.3.0)}

Thanks, this works perfectly for this dataset! – Shoesoff Jan 09 '21 at 15:36 — Shoesoff, Jan 09 '21 at 15:36

Trying to make a table in R where I group columns by variables of a vector

2 Answers2

Linked