
What I currently have: Plot

What I want to achieve: Expectations

The csv file looks like this: CSV file

EDIT: structure of the data:

| Lp. | P_1_1 | P_1_2 | P_1_3 | P_1_4 | P_1_5 | P_1_6 | P_2_1 | P_2_2 | P_3_1 | P_3_2 | P_4_1 | P_4_2 | P_4_3 | P_4_4 | Wyksztalcenie |
|-----|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|---------------|
|  1  |   3   |   4   |   4   |   4   |   3   |   3   |   1   |   7   |   7   |   7   |   1   |   3   |   5   |   7   |     wyzsze    |
|  2  |   3   |   1   |   1   |   5   |   3   |   3   |   1   |   4   |   6   |   4   |   1   |   5   |   3   |   4   |    srednie    |
|  3  |   3   |   2   |   3   |   4   |   2   |   7   |   1   |   6   |   6   |   6   |   5   |   3   |   4   |   4   |     wyzsze    |
|  4  |   3   |   3   |   4   |   4   |   3   |   4   |   1   |   5   |   6   |   6   |   3   |   3   |   5   |   5   |    srednie    |
|  5  |   3   |   1   |   4   |   7   |   3   |   3   |   3   |   6   |   5   |   7   |   5   |   5   |   2   |   2   |    srednie    |
|  6  |   3   |   4   |   1   |   4   |   3   |   3   |   1   |   6   |   7   |   7   |   7   |   4   |   7   |   5   |    srednie    |

CSV data after change: CSV after change

Changing the csv data for the chart:

library(dplyr)  # provides %>% and count()

dane <- read.csv2("ankieta.csv", header = TRUE, sep = ";")
temp <- dane %>% count(P_3_1, Wyksztalcenie)
temp$pytanie <- "P_3_1"
colnames(temp) <- c("Lp", "Wykształcenie", "Liczebność", "Pytanie")
temp2 <- dane %>% count(P_3_2, Wyksztalcenie)
temp2$pytanie <- "P_3_2"
colnames(temp2) <- c("Lp", "Wykształcenie", "Liczebność", "Pytanie")
df_merge <- rbind(temp, temp2)

Data structure after change:

| Lp | Wykształcenie | Liczebność | Pytanie |
|----|---------------|------------|---------|   
| 1  | podstawowe    |       1    |  P_3_1  |
| 1  |    srednie    |      52    |  P_3_1  |
| 1  |     wyzsze    |      65    |  P_3_1  |
| 1  |   zawodowe    |      11    |  P_3_1  |
| 2  | podstawowe    |       1    |  P_3_1  |
| 2  |    srednie    |      45    |  P_3_1  |

Is there a simpler way to prepare the data? My approach gets the result, but it feels very inelegant.

ggplot2 code:

p1 <- 
  ggplot(data = df_merge, aes(x = Lp, y = Liczebność, color = Pytanie))+ 
  geom_bar(stat = "identity")+
  facet_grid(. ~ Wykształcenie)+
  labs(x = "Wykształcenie", 
       y = "Liczebność", 
       title = "Rozkład odpowiedzi na pytania grupy 'P_3', w podziale na wykształcenie:")
p1 + scale_x_discrete(name ="Wykształcenie", 
                      limits=c("1","2","3","4","5","6","7"))

What I don't know how to do:

  • How to change the order in which the panels are displayed (podstawowe, zawodowe, srednie, wyzsze)
  • How to change the size of the chart (grid) and display the panels in 2 columns
  • How to fill the whole bar with color (the bars should not be gray)
Ragg47
  • Please don't post data as images. Take a look at how to make a [great reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for ways of showing data. The gold standard for providing data is using `dput(head(NameOfYourData))`, *editing* your question and putting the `structure()` output into the question. – Martin Gal Aug 28 '21 at 16:23
  • Regarding your question: This could most likely be done using `dplyr`. I bet there will be some answers to your question once you provide an example of your data in `structure()`-form. – Martin Gal Aug 28 '21 at 16:25
  • To change the order, you have to reorder the levels of the factor to the order you want (see fct_relevel). To display in 2 cols, see `ncol` argument in `facet_wrap`. To fill the bar, use the `fill` argument in `geom_bar` – Brigadeiro Aug 28 '21 at 16:31
  • @MartinGal I inserted structure() output into the question. – Ragg47 Aug 28 '21 at 17:59
  • 1. To read data from a CSV file, I recommend the functions in the `readr` package. 2. After loading, convert the education column to a `factor`; I recommend the `forcats` package for this. 3. Then do the transformation with `dplyr`. 4. Finally, visualize with `ggplot2`. How nice that all of this comes together in the `tidyverse` package! P.S. Post data, not pictures! P.S. 2 You have a lot of respondents with higher education :-) – Marek Fiołka Aug 28 '21 at 18:08
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Aug 28 '21 at 22:19

1 Answer

  1. Using `dplyr`, you could use ``arrange(`Lp.`, Pytanie, match(Wyksztalcenie, c("podstawowe", "zawodowe", "srednie", "wyzsze")))`` to sort your data by `Lp.`, `Pytanie` and `Wyksztalcenie`, the latter in the custom order. Note that the panel order in the plot follows the factor levels rather than the row order, so for the plot itself the column also has to be a factor with those levels.
  2. You could use `facet_wrap(. ~ Wyksztalcenie, nrow = 2)` instead of `facet_grid()`.
  3. You could use `fill` instead of `color`. This gives you colored bars instead of grey bars with colored borders.
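
As a rough illustration of the three points together, here is a minimal sketch. It assumes only the six count rows shown in the question, and it uses ASCII column names (`Wyksztalcenie`, `Liczebnosc`) chosen here to sidestep encoding issues; the name `education_order` is made up for the example. `geom_col()` is used as the equivalent of `geom_bar(stat = "identity")`.

```r
library(ggplot2)

# The six count rows shown in the question (ASCII column names are an assumption)
df_merge <- data.frame(
  Lp            = c(1, 1, 1, 1, 2, 2),
  Wyksztalcenie = c("podstawowe", "srednie", "wyzsze", "zawodowe",
                    "podstawowe", "srednie"),
  Liczebnosc    = c(1, 52, 65, 11, 1, 45),
  Pytanie       = "P_3_1"
)

# Facet panels follow the factor levels, so set the desired order explicitly
education_order <- c("podstawowe", "zawodowe", "srednie", "wyzsze")
df_merge$Wyksztalcenie <- factor(df_merge$Wyksztalcenie, levels = education_order)

p <- ggplot(df_merge, aes(x = factor(Lp), y = Liczebnosc, fill = Pytanie)) +
  geom_col() +                           # fill colors the whole bar
  facet_wrap(~ Wyksztalcenie, ncol = 2)  # panels laid out in two columns

print(p)
```

With the factor in place, the panels should appear in the order podstawowe, zawodowe, srednie, wyzsze regardless of how the rows are sorted.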

So based on your data shown:

library(dplyr)
library(tidyr)
library(ggplot2)

df %>% 
  pivot_longer(-c(`Lp.`, Wyksztalcenie),
               names_to = "Pytanie",
               values_to = "Liczebność") %>% 
#  Use filter to filter for specific data
#  filter(`Lp.` == "3", Pytanie == "P_3_1") %>% 
#  or for P_3
  filter(grepl("P_3", Pytanie)) %>% 
#  Sort the data for specific columns
  # The education column in the data is spelled without diacritics
  arrange(`Lp.`, Pytanie, match(Wyksztalcenie, c("podstawowe", "zawodowe", "srednie", "wyzsze"))) %>%
  ggplot(aes(x = `Lp.`, y = Liczebność, fill = Pytanie)) + 
  geom_bar(stat = "identity") +
  facet_wrap(. ~ Wyksztalcenie, nrow = 2) +
  labs(x = "Wykształcenie", 
       y = "Liczebność", 
       title = "Rozkład odpowiedzi na pytania grupy 'P_3', w podziale na wykształcenie:") + 
  scale_x_discrete(name ="Wykształcenie", 
                   limits=c("1","2","3","4","5","6","7"))

returns (without any filtering)

[image of the resulting plot]

or (with `filter(grepl("P_3", Pytanie))`)

[image of the resulting plot]
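
On the question of preparing the counts more simply: one possible sketch, assuming the goal is the same shape as `df_merge` in the question (one row per answer value, education level and question). It reshapes the `P_3` columns with `pivot_longer()` and counts in a single `count()` call instead of building `temp` and `temp2` separately. The data frame below holds only the first six rows of the table shown in the question, and the names `Odpowiedz`, `Liczebnosc` and `df_counts` are made up for the example.

```r
library(dplyr)
library(tidyr)

# First six rows of the survey, P_3 columns only (copied from the question's table)
dane <- data.frame(
  Lp.           = 1:6,
  P_3_1         = c(7, 6, 6, 6, 5, 7),
  P_3_2         = c(7, 4, 6, 6, 7, 7),
  Wyksztalcenie = c("wyzsze", "srednie", "wyzsze", "srednie", "srednie", "srednie")
)

df_counts <- dane %>%
  # One row per respondent and question; the answer goes into Odpowiedz
  pivot_longer(starts_with("P_3"),
               names_to = "Pytanie",
               values_to = "Odpowiedz") %>%
  # Number of respondents per answer value, education level and question
  count(Odpowiedz, Wyksztalcenie, Pytanie, name = "Liczebnosc")

print(df_counts)
```

Adding more question groups would then only require changing the column selection in `pivot_longer()`.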

Martin Gal