
I would like to make a bar plot in which each bar represents one of the three columns in this data frame. The height of each bar should be the column total in the row added by adorn_totals.

Reproducible example:

library(janitor)

test_df <- data.frame(
  a = c(1:5),
  b = c(1:5),
  c = c(1:5)
  ) %>% 
  adorn_totals(where = 'row', tabyl = c(a, b, c))

I tried a solution from a previous post ("Bar plot for each column in a data frame in R"), but it didn't work:

library(janitor)
library(ggplot2)

df <- data.frame(
  a = c(1:5),
  b = c(1:5),
  c = c(1:5)
  ) %>% 
  adorn_totals(where = 'row', tabyl = c(a, b, c))

lapply(names(df), function(col) {
  ggplot(df, aes(.data[[col]], ..count..)) + 
    geom_bar(aes(fill = .data[[col]]), position = "dodge")
}) -> list_plots
Novice

3 Answers


This is one way:

library(janitor)
library(ggplot2)

test_df <- data.frame(
  a = c(1:5),
  b = c(1:5),
  c = c(1:5)
  ) %>% 
  adorn_totals(where = 'row', tabyl = c(a, b, c))

tail(test_df,1) %>% stack() %>% 
  ggplot(aes(ind, values)) + geom_col()

Created on 2022-11-07 with reprex v2.0.2

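Of course, you don't need to total the data frame before plotting it: geom_col() stacks the values within each category, so each bar's overall height already equals the column sum. Here is another example that shows what stack() does, adds some color, and skips the totals row.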

library(ggplot2)

test_df <- data.frame(
  a = c(1:5),
  b = c(1:5),
  c = c(1:5))

test_df |> stack()
#>    values ind
#> 1       1   a
#> 2       2   a
#> 3       3   a
#> 4       4   a
#> 5       5   a
#> 6       1   b
#> 7       2   b
#> 8       3   b
#> 9       4   b
#> 10      5   b
#> 11      1   c
#> 12      2   c
#> 13      3   c
#> 14      4   c
#> 15      5   c

test_df |> stack() |> 
  ggplot(aes(ind, values, fill=ind)) + geom_col()

Created on 2022-11-07 with reprex v2.0.2
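
As a quick sanity check on the claim that the stacked bars end up at the column totals, here is a minimal base-R sketch. The names long_df, bar_heights and col_totals are only illustrative and are not part of the code above.

```r
# Same toy data as in the example above
test_df <- data.frame(a = 1:5, b = 1:5, c = 1:5)

# Long format: one row per value, with 'ind' naming the source column
long_df <- stack(test_df)

# Sum of 'values' within each level of 'ind'; this is the height each
# stacked bar reaches when geom_col() piles the segments on top of each other
bar_heights <- tapply(long_df$values, long_df$ind, sum)

# Column totals computed directly from the wide data frame
col_totals <- colSums(test_df)

print(bar_heights)
print(col_totals)
```

If both printouts agree, the totals row is not needed for the plot itself.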

Ric
  • When I use `geom_col()` in your second example, there are faint horizontal lines from the barely visible stacked segments of each bar. They go away when both fill and color are defined. Weird. – M.Viking Nov 08 '22 at 02:48
  • Yes, perhaps geom_col is not the best option. It is better to use geom_bar(stat="sum"), but it adds a weird legend that can be removed with the parameter show.legend=F – Ric Nov 08 '22 at 03:07

If you want to use ggplot, the best approach is to slice the totals row off the bottom, pivot it into long format, and plot the result:

library(janitor)
library(tidyverse)

data.frame(
  a = c(1:5),
  b = c(1:5),
  c = c(1:5)
) %>% 
  adorn_totals(where = 'row', tabyl = c(a, b, c)) %>%
  slice_tail(n = 1) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(name, value, fill = name)) +
  geom_col(color = "gray") +
  scale_fill_brewer() +
  theme_minimal(base_size = 16)


Allan Cameron

Two pivot_longer alternatives without janitor::adorn_totals()

library(tidyverse)

# Maps value to the weight aesthetic, so the count stat sums the values instead of counting rows
# geom_bar() only takes one positional aesthetic (x OR y)
data.frame(a = c(1:5), b = c(1:5), c = c(1:5)) %>% 
  pivot_longer(everything()) %>% 
  ggplot(aes(name, weight=value))+
  geom_bar()


#geom_col version
#Lots of flexibility in summarise:
data.frame(a = c(1:5), b = c(1:5), c = c(1:5)) %>% 
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  summarise(total=sum(value)) %>% 
  ggplot(aes(name, total))+
  geom_col()
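
A third variation, offered only as a sketch: let ggplot2 do the summing with stat = "summary". This assumes ggplot2 3.3.0 or later (where the fun argument is available) and uses base R's stack() for the long format. The names long_df and p are illustrative. Because each group is collapsed to a single total before drawing, it should also avoid the faint lines between stacked segments mentioned in the comments on the first answer.

```r
library(ggplot2)

# Long format built with base R's stack(); columns are 'values' and 'ind'
long_df <- stack(data.frame(a = 1:5, b = 1:5, c = 1:5))

# stat = "summary" with fun = "sum" collapses each group to one total,
# so a single rectangle is drawn per column instead of five stacked segments
p <- ggplot(long_df, aes(ind, values, fill = ind)) +
  geom_bar(stat = "summary", fun = "sum", show.legend = FALSE)

p
```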
M.Viking