I am trying to plot a series of variables that were collected in two time frames (waves). The data has 9700 observations, and the variables are factors. The structure of the data is shown below.

[Image: structure of the data]

I want to make a bar plot like the one below, with the sb variables listed for each wave.

[Image: desired bar plot]

I have tried the aggregate function and dplyr, but I could not get the data into a suitable structure.

I would be very grateful for any help with this.

Thank you,

Ali Roghani
  • These might be useful https://stackoverflow.com/questions/64146274/how-to-use-geom-bar-to-create-two-grouped-columns-in-r & https://ggplot2.tidyverse.org/reference/position_dodge.html – Tung Nov 15 '20 at 05:53

1 Answer

As @Tung suggested, you can put your data into long format and use position_dodge so that the bars sit next to each other in the plot. Here is an example.

With pivot_longer from tidyr, you can put the columns whose names start with "sb" into long form. Then filter out the rows where the value is zero. unite combines the name and value columns, so that sb_1 and x become sb_1_x.

In this format the data is easier to plot. Use geom_bar to create the bar plot, and position_dodge2 to place the bars for the different wave values next to each other. Setting preserve = "single" keeps all bars the same width, even where one wave has a zero count.

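library(tidyverse)  # loads tidyr, dplyr and ggplot2

df %>%
  # stack all sb* columns into name/value pairs
  pivot_longer(cols = starts_with("sb")) %>%
  # drop the zero entries
  filter(value != 0) %>%
  # combine name and value, e.g. sb_1 and x become sb_1_x
  unite(sb, name, value) %>%
  ggplot(aes(x = sb)) +
    # one bar per wave, side by side, all with the same width
    geom_bar(aes(fill = wave), position = position_dodge2(preserve = "single"))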

Plot

[Image: bar plot]
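
Since the question's data is only available as a screenshot, here is a minimal self-contained sketch of the same reshaping steps on made-up data. The data frame, its column names (sb_1, sb_2) and its values ("x", "y", "0") are assumptions for illustration only, not the asker's real data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: factor columns where "0" means "nothing reported"
df <- data.frame(
  wave = factor(c("first", "first", "first", "second", "second", "second")),
  sb_1 = factor(c("x", "0", "x", "x", "0", "0")),
  sb_2 = factor(c("0", "y", "y", "0", "0", "y"))
)

# Long format: one row per non-zero answer, labelled e.g. "sb_1_x"
long <- df %>%
  pivot_longer(cols = starts_with("sb")) %>%
  filter(value != "0") %>%
  unite(sb, name, value)

# geom_bar counts the rows in each sb/wave combination
p <- ggplot(long, aes(x = sb, fill = wave)) +
  geom_bar(position = position_dodge2(preserve = "single"))
p
```

Inspecting long before plotting is a quick way to check that the reshaping did what you expected: it should contain only the wave column and the combined sb column.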

Ben
  • You are the best. – Ali Roghani Nov 17 '20 at 06:12
  • Could you help me with this. https://stackoverflow.com/questions/64884680/ordering-x-axis-using-ggplot . The order is changed if my variable is more than 10. Thank you – Ali Roghani Nov 17 '20 at 23:08
  • Could you please let me know what function I should use if I want percentages rather than counts? That is, for each bar I need (number of x / total sample in each wave). – Ali Roghani Nov 21 '20 at 00:11
  • ```new_data %>% group_by(wave) %>% mutate(total.wave= sum(count)) %>% group_by(total.wave, sb) %>% mutate(per=paste0(round(100*count/total.wave,2),'%')) pivot_longer(cols = starts_with("sb_")) %>% filter(value != 0) %>% unite(sb_,name, value) %>% ggplot(aes(x = sb_)) + geom_bar(aes(fill = wave), position = position_dodge2(preserve = "single")))``` and I got this error: Error: Problem with `mutate()` input `total.wave`. x invalid 'type' (closure) of argument i Input `total.wave` is `sum(count)`. i The error occurred in group 1: wave = "first". – Ali Roghani Nov 21 '20 at 00:52
  • Thank you, I did that, and bars do not appear in the plot. I have made a new question https://stackoverflow.com/questions/64939338/plotting-series-of-factor-variables-side-by-side-and-have-the-percentage-on-y-ax Please check that and I will upload the picture there. – Ali Roghani Nov 21 '20 at 03:18
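
For the percentage question raised in the comments, one possible approach is sketched below. The "(closure)" in the error message means that sum() was handed a function. This most likely happened because the data had no column named count at that point, so count resolved to dplyr's count() function. The sketch counts the rows first and then divides by the number of observations per wave. The toy data and the names wave_totals, wave_n, pct_df and pct are assumptions for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: factor columns where "0" means "nothing reported"
df <- data.frame(
  wave = factor(c("first", "first", "first", "second", "second", "second")),
  sb_1 = factor(c("x", "0", "x", "x", "0", "0")),
  sb_2 = factor(c("0", "y", "y", "0", "0", "y"))
)

# Denominator: number of observations in each wave, from the wide data
wave_totals <- df %>% count(wave, name = "wave_n")

pct_df <- df %>%
  pivot_longer(cols = starts_with("sb")) %>%
  filter(value != "0") %>%
  unite(sb, name, value) %>%
  count(wave, sb, name = "n") %>%          # numerator: rows per wave and sb
  left_join(wave_totals, by = "wave") %>%
  mutate(pct = 100 * n / wave_n)

# geom_col plots the precomputed percentages instead of counting rows
p <- ggplot(pct_df, aes(x = sb, y = pct, fill = wave)) +
  geom_col(position = position_dodge2(preserve = "single")) +
  labs(y = "Percent of wave")
p
```

The key change from the count version is geom_col in place of geom_bar: geom_bar counts rows itself, while geom_col uses the y values you supply.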