duckmayr’s answer addresses the posted question. However, I would be
remiss if I did not draw attention to a major error in the construction of
summary_statistics
. Every summary statistic is defined by
myData$<variable>
which means that the grouping function fails to look at
the results within the group (city name). Look at the values in the table -
the same summary statistics are in each and every column.
I expect part of the problem is that prior to version 0.5.0 of qwraps2 the
use of the data pronoun .data
was suggests/needed. However, as of the
release of version 0.5.0 (published on CRAN on 1 September 2020) the
summary_table
method have been refactored to avoid the use of the data
pronoun .data
and to support, but not require, dplyr
.
For example, let’s build and example data set:
set.seed(42)
myData <-
data.frame(city = gl(n = 4, k = 25, labels = c("Eilat", "Jerusalem", "Metula", "TelAviv")),
hobby_hr_week = rpois(n = 100, lambda = 15.54),
work_hr_week = rpois(n = 100, lambda = 30.34),
wellness = rnorm(n = 100, mean = -56.11, sd = 100.01),
RU_happy = rbinom(n = 100, size = 1, p = 0.35))
Load and attach qwraps2
library(qwraps2)
options(qwraps2_markup = "markdown")
packageVersion("qwraps2")
#> [1] '0.5.0'
Defining the summary_table
as in the original question post:
summary_statistics <-
list(
"Hobby(hours/week)" =
list(
"mean (sd)" = ~ qwraps2::mean_sd(myData$hobby_hr_week, na_rm = TRUE),
"min" = ~ min(myData$hobby_hr_week, na.rm = TRUE),
"max" = ~ max(myData$hobby_hr_week, na.rm = TRUE)
),
"Work(hours/week)" =
list(
"mean (sd)" = ~ qwraps2::mean_sd(myData$work_hr_week, na_rm = TRUE),
"min" = ~ min(myData$work_hr_week, na.rm = TRUE),
"max" = ~ max(myData$work_hr_week, na.rm = TRUE)
),
"Wellness" =
list(
"mean (sd)" = ~ qwraps2::mean_sd(myData$wellness, na_rm = TRUE),
"min" = ~ min(myData$wellness, na.rm = TRUE),
"max" = ~ max(myData$wellness, na.rm = TRUE)
),
"Happiness" =
list(
"Happiness" = ~qwraps2::n_perc(myData$RU_happy)
)
)
we’ll get a table, as posted in the original question, but again, the value
are incorrect, they are the value for the whole data set because myData
prefaces each of the variable names.
summary_table(myData, summary_statistics)
#>
#>
#> | |myData (N = 100) |
#> |:----------------------|:---------------------|
#> |**Hobby(hours/week)** | |
#> | mean (sd) |15.76 ± 4.06 |
#> | min |4 |
#> | max |27 |
#> |**Work(hours/week)** | |
#> | mean (sd) |29.40 ± 5.73 |
#> | min |19 |
#> | max |43 |
#> |**Wellness** | |
#> | mean (sd) |-51.84 ± 97.33 |
#> | min |-391.075901351521 |
#> | max |209.058254168693 |
#> |**Happiness** | |
#> | Happiness |34 (34.00%) |
summary_table(myData, summary_statistics, by = "city")
#>
#>
#> | |Eilat (N = 25) |Jerusalem (N = 25) |Metula (N = 25) |TelAviv (N = 25) |
#> |:----------------------|:---------------------|:---------------------|:---------------------|:---------------------|
#> |**Hobby(hours/week)** | | | | |
#> | mean (sd) |15.76 ± 4.06 |15.76 ± 4.06 |15.76 ± 4.06 |15.76 ± 4.06 |
#> | min |4 |4 |4 |4 |
#> | max |27 |27 |27 |27 |
#> |**Work(hours/week)** | | | | |
#> | mean (sd) |29.40 ± 5.73 |29.40 ± 5.73 |29.40 ± 5.73 |29.40 ± 5.73 |
#> | min |19 |19 |19 |19 |
#> | max |43 |43 |43 |43 |
#> |**Wellness** | | | | |
#> | mean (sd) |-51.84 ± 97.33 |-51.84 ± 97.33 |-51.84 ± 97.33 |-51.84 ± 97.33 |
#> | min |-391.075901351521 |-391.075901351521 |-391.075901351521 |-391.075901351521 |
#> | max |209.058254168693 |209.058254168693 |209.058254168693 |209.058254168693 |
#> |**Happiness** | | | | |
#> | Happiness |34 (34.00%) |34 (34.00%) |34 (34.00%) |34 (34.00%) |
To get the correct summary statistics for each city only use the variable
name.
summary_statistics <-
list(
"Hobby(hours/week)" =
list(
"mean (sd)" = ~ qwraps2::mean_sd(hobby_hr_week, na_rm = TRUE),
"min" = ~ min(hobby_hr_week, na.rm = TRUE),
"max" = ~ max(hobby_hr_week, na.rm = TRUE)
),
"Work(hours/week)" =
list(
"mean (sd)" = ~ qwraps2::mean_sd(work_hr_week, na_rm = TRUE),
"min" = ~ min(work_hr_week, na.rm = TRUE),
"max" = ~ max(work_hr_week, na.rm = TRUE)
),
"Wellness" =
list(
"mean (sd)" = ~ qwraps2::mean_sd(wellness, na_rm = TRUE),
"min" = ~ min(wellness, na.rm = TRUE),
"max" = ~ max(wellness, na.rm = TRUE)
),
"Happiness" =
list(
"Happiness" = ~qwraps2::n_perc(RU_happy)
)
)
And the updated table:
print(
summary_table(myData, summary_statistics, by = "city"),
rtitle = "Summary Statistics Table for the Wellness Data Set"
)
#>
#>
#> |Summary Statistics Table for the Wellness Data Set |Eilat (N = 25) |Jerusalem (N = 25) |Metula (N = 25) |TelAviv (N = 25) |
#> |:--------------------------------------------------|:---------------------|:----------------------|:---------------------|:---------------------|
#> |**Hobby(hours/week)** | | | | |
#> | mean (sd) |16.20 ± 4.47 |15.96 ± 4.59 |15.64 ± 3.68 |15.24 ± 3.60 |
#> | min |8 |4 |10 |11 |
#> | max |24 |27 |22 |24 |
#> |**Work(hours/week)** | | | | |
#> | mean (sd) |29.36 ± 6.75 |31.24 ± 5.06 |28.72 ± 5.78 |28.28 ± 5.05 |
#> | min |20 |20 |19 |21 |
#> | max |43 |41 |40 |38 |
#> |**Wellness** | | | | |
#> | mean (sd) |-84.07 ± 94.75 |-51.45 ± 124.61 |-36.56 ± 77.73 |-35.28 ± 83.20 |
#> | min |-232.587140717081 |-391.075901351521 |-184.147379739545 |-172.159781696125 |
#> | max |120.55715927506 |209.058254168693 |113.344403784632 |113.643322269377 |
#> |**Happiness** | | | | |
#> | Happiness |9 (36.00%) |10 (40.00%) |9 (36.00%) |6 (24.00%) |
Created on 2020-09-01 by the reprex package (v0.3.0)

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os macOS Catalina 10.15.6
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Denver
#> date 2020-09-01
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
#> backports 1.1.9 2020-08-24 [1] CRAN (R 4.0.2)
#> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.0)
#> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
#> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2)
#> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0)
#> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.0)
#> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
#> processx 3.4.3 2020-07-05 [1] CRAN (R 4.0.0)
#> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2)
#> qwraps2 * 0.5.0 2020-08-31 [1] local
#> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
#> Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.0)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
#> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
#> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
#> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.0)
#> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0)
#> xfun 0.16 2020-07-24 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library