I have been reading about two-header tables here and here with the expss package, but the code I found online didn't work for me. I want to create a table very similar to this image:

enter image description here

The dataframe is:

df <- data.frame(Categoria = c("gender", "gender" , "gender", "gender", "gender", "gender", 
                                 "religion", "religion", "religion", "religion", "religion",
                                 "religion", "religion", "religion", "religion", "religion", 
                                 "religion", "religion"),
                 Opcoes_da_categoria = c("Mulher", "Homem", "Mulher", "Homem", "Mulher", 
                                           "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
                                           "Evangélico", "Outra religião", "Católico", 
                                           "Agnóstico ou ateu", "Evangélico", "Outra religião",
                                           "Católico", "Agnóstico ou ateu", "Evangélico"),
                 Resposta = c("A Favor", "A Favor", "Contra",  "Contra",  "Não sei", "Não sei",
                              "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
                              "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"),
                 value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5))

My code to create the two-header table is below, but it doesn't work properly, for the following reasons:

  • The table should have two header rows
  • The column names should not appear in the table
  • The values should not have decimal places

library(expss)

my_table <- df %>%
  tab_cells(Resposta) %>%
  tab_weight(value_perc) %>% 
  tab_cols(Opcoes_da_categoria, Categoria) %>%
  tab_stat_cpct(total_label = NULL) %>%
  tab_pivot()

library(gridExtra)

png("my_table.png", height = 50*nrow(my_table), width = 200*ncol(my_table))
grid.table(my_table)
dev.off()
  

enter image description here

Henrik
polo
  • Not familiar with `expss` but this can be done with `knitr::kable()` and `kableExtra`. I do not know the exact style you want, but it is another option: [vignette here](https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html#grouped_columns__rows) – Andrew Jun 22 '20 at 14:46
  • I tried knitr::kable() and kableExtra too, but that didn't work for me either. It wouldn't be a problem to use these packages instead of expss – polo Jun 22 '20 at 14:50
  • @polo I recently developed a package that may automatically do something similar to what you are trying to achieve. The output is a bit different than your image, but you might want to check it out [here](https://github.com/DanChaltiel/crosstable). – Dan Chaltiel Jun 28 '20 at 15:33
  • thank you, @DanChaltiel – polo Jun 29 '20 at 15:17

3 Answers


I don't know expss, but I have used flextable recently and found it nice. I am far from an expert in it, but I managed to make a good-looking table that comes close to what you want. Starting from your data frame, a few changes are needed to bring it into the format the table requires: first it is pivoted to wide format, then the columns are renamed by extracting the part of each name before the underscore (_). Next, a data frame called typology is built that maps each column key to its two header labels (this approach can be found in the link above). Finally, the flextable part builds a flextable and then applies typology and other formatting commands.

The attached picture shows the result.


library(tidyverse)
library(flextable)
#> 
#> Attaching package: 'flextable'
#> The following object is masked from 'package:purrr':
#> 
#>     compose
df <- data.frame(
  Categoria = c(
    "gender", "gender", "gender", "gender", "gender", "gender",
    "religion", "religion", "religion", "religion", "religion",
    "religion", "religion", "religion", "religion", "religion",
    "religion", "religion"
  ),
  Opcoes_da_categoria = c(
    "Mulher", "Homem", "Mulher", "Homem", "Mulher",
    "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
    "Evangélico", "Outra religião", "Católico",
    "Agnóstico ou ateu", "Evangélico", "Outra religião",
    "Católico", "Agnóstico ou ateu", "Evangélico"
  ),
  Resposta = c(
    "A Favor", "A Favor", "Contra", "Contra", "Não sei", "Não sei",
    "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
    "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"
  ),
  value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5)
)


# adjust your df to match cols and names with tidyverse
dfa <- df %>%
  pivot_wider(names_from =c('Opcoes_da_categoria', 'Categoria'), values_from = 'value_perc')
nam <- str_extract(colnames(dfa),'^[^_]+')
colnames(dfa) <- nam

typology <- data.frame(
  col_keys = c( "Resposta",
                "Mulher", "Homem",
                "Outra religião", "Católico",
                "Agnóstico ou ateu", "Evangélico"),
  what = c("", "Genero", "Genero", "Religio",
           "Religio", "Religio", 'Religio'),
  measure = c( "Resposta", 
               "Mulher", "Homem",
               "Outra religião", "Católico",
               "Agnóstico ou ateu", "Evangélico"),
  stringsAsFactors = FALSE )

library(officer) # needed for making border
dftab <- flextable::flextable(dfa)

border_v = fp_border(color="gray")
dftab <- dftab %>% 
  set_header_df(mapping = typology, key = "col_keys" ) %>% 
  merge_h(part = "header") %>% 
  merge_v(part = "header") %>% 
  theme_booktabs() %>% 
  vline(border = border_v, j =3, part = 'body') %>% 
  vline(border = border_v, j =3, part = 'header')
print(dftab)
#> a flextable object.
#> col_keys: `Resposta`, `Mulher`, `Homem`, `Outra religião`, `Católico`, `Agnóstico ou ateu`, `Evangélico` 
#> header has 2 row(s) 
#> body has 3 row(s) 
#> original dataset sample: 
#>   Resposta Mulher Homem Outra religião Católico Agnóstico ou ateu Evangélico
#> 1  A Favor     65    50             67       64                56         28
#> 2   Contra     33    43             31       34                35         66
#> 3  Não sei      2     7              2        2                10          5
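
The renaming step in the code above depends on pivot_wider() joining the two name columns with its default separator "_", so each wide column is named like "Mulher_gender". As a minimal sketch of that step in isolation, the snippet below uses base R's sub() in place of str_extract() (a substitution made here for illustration, not the answer's own call) on a hand-written sample of such names. It assumes that no option label contains an underscore itself; if one did, the label would be cut short.

```r
# Sample of column names as pivot_wider() would build them with the default names_sep = "_"
wide_names <- c("Resposta", "Mulher_gender", "Homem_gender", "Agnóstico ou ateu_religion")

# Drop the first underscore and everything after it;
# names without an underscore (here "Resposta") are left unchanged
short_names <- sub("_.*$", "", wide_names)
print(short_names)
```

The shortened names are what the col_keys column of typology has to match exactly, which is why the renaming has to happen before set_header_df() is called.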

enter image description here

MarBlo

Here is a flexible kable solution that should adapt to different tables as long as you can get the data into wide format. Hope it helps--let me know if you have questions!

library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

df_wide <- df %>% # transform data to wide format, "drop" name for Resposta
  pivot_wider(names_from = c(Categoria, Opcoes_da_categoria), 
              values_from = value_perc, names_sep = "_") %>%
  rename(" " = Resposta)

cols <- sub("(.*?)_(.*)", "\\2", names(df_wide)) # grab everything after the _
grps <- sub("(.*?)_(.*)", "\\1", names(df_wide)) # grab everything before the _

df_wide %>%
  kable(col.names = cols) %>% 
  kable_styling(c("striped"), full_width = FALSE) %>% # check out ?kable_styling for other options
  add_header_above(table(grps)[unique(grps)]) # unique makes sure it is the correct order
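
The last line above passes add_header_above() a named vector of column spans. A minimal base-R sketch of how that vector is built is shown below; the group order is hypothetical and deliberately not alphabetical, to show why the unique() reordering matters.

```r
# Group label of each column in the wide table, left to right
# (hypothetical order; " " stands for the unnamed first column)
grps <- c(" ", "religion", "religion", "gender", "gender", "gender")

counts <- table(grps)           # one count per label, ordered by sorted label
header <- counts[unique(grps)]  # reordered to the order in which labels first appear
print(header)
```

Each name in header becomes one cell of the top header row, and its count is the number of columns that cell spans, so the counts have to add up to the number of columns in the table.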
Andrew

You are trying to view the table in the RStudio Data Viewer, which shows expss tables as ordinary data.frames.

You can view expss tables in the RStudio Viewer (not the Data Viewer) by calling expss_output_viewer():

df <- data.frame(Categoria = c("gender", "gender" , "gender", "gender", "gender", "gender", 
                               "religion", "religion", "religion", "religion", "religion",
                               "religion", "religion", "religion", "religion", "religion", 
                               "religion", "religion"),
                 Opcoes_da_categoria = c("Mulher", "Homem", "Mulher", "Homem", "Mulher", 
                                         "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
                                         "Evangélico", "Outra religião", "Católico", 
                                         "Agnóstico ou ateu", "Evangélico", "Outra religião",
                                         "Católico", "Agnóstico ou ateu", "Evangélico"),
                 Resposta = c("A Favor", "A Favor", "Contra",  "Contra",  "Não sei", "Não sei",
                              "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
                              "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"),
                 value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5))

library(expss)

my_table <- df %>%
    tab_cells(Resposta) %>%
    tab_weight(value_perc) %>% 
    tab_cols(Opcoes_da_categoria, Categoria) %>%
    tab_stat_cpct(total_label = NULL) %>%
    tab_pivot()

expss_digits(0) # turn off decimal digits
expss_output_viewer() # turn on displaying tables in the viewer
my_table

expss_output_default() # turn off displaying tables in the viewer

This code gives the following result: enter image description here
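
A possible explanation for the decimals in the original output, sketched in base R: the code uses value_perc as a weight and asks for column percentages, so every cell is divided again by its column total. The check below assumes that a weighted column percentage is the cell weight divided by the column's total weight, times 100; it rebuilds only the two columns of df it needs, in the same row order as the question.

```r
# Option labels and percentages in the same row order as the question's df
opcao <- c(rep(c("Mulher", "Homem"), 3),
           rep(c("Outra religião", "Católico", "Agnóstico ou ateu", "Evangélico"), 3))
value_perc <- c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5)

# Total weight per table column
col_totals <- tapply(value_perc, opcao, sum)
print(col_totals)

# Column percentages recomputed from the weights (assumed formula, see above)
col_pct <- 100 * value_perc / col_totals[opcao]
print(round(col_pct, 1))
```

Two of the columns do not add up to exactly 100 (rounding in the source data), so their recomputed percentages are not whole numbers. Rounding the display with expss_digits(0) hides this.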

If you really want to display the table in the Data Viewer, you can convert it to an ordinary data.frame. There is a dedicated function for that, split_table_to_df:

View(split_table_to_df(my_table))

With the result: enter image description here

UPDATE:

df <- data.frame(Categoria = c("gender", "gender" , "gender", "gender", "gender", "gender", 
                               "religion", "religion", "religion", "religion", "religion",
                               "religion", "religion", "religion", "religion", "religion", 
                               "religion", "religion"),
                 Opcoes_da_categoria = c("Mulher", "Homem", "Mulher", "Homem", "Mulher", 
                                         "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
                                         "Evangélico", "Outra religião", "Católico", 
                                         "Agnóstico ou ateu", "Evangélico", "Outra religião",
                                         "Católico", "Agnóstico ou ateu", "Evangélico"),
                 Resposta = c("A Favor", "A Favor", "Contra",  "Contra",  "Não sei", "Não sei",
                              "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
                              "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"),
                 value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5))

library(expss)

my_table <- df %>%
    apply_labels(
        Resposta = "",
        Opcoes_da_categoria = "",
        Categoria = ""
    ) %>% 
    tab_cells(Resposta) %>%
    tab_weight(value_perc) %>% 
    tab_cols(Categoria, Opcoes_da_categoria) %>%
    tab_stat_cpct(total_row_position = "none") %>%
    tab_pivot()

expss_digits(0) # turn off decimal digits
View(my_table)

enter image description here

Gregory Demin
  • Thanks for the answer, Gregory Demin, but my issue is that the column names (Opcoes_da_categoria and Categoria) should not appear in the table. The table should have two header rows (the text of the Categoria column, then the text of the Opcoes_da_categoria column), so "gender" and "religion" should come first. Also, how can I delete the "#Total cases" row? – polo Jun 23 '20 at 11:38