I'm using the built-in R dataset UCBAdmissions and trying to create a grouped barplot from it without ggplot. The data is coerced to a data frame and should be grouped by Admit, Gender and Dept.

data(UCBAdmissions)
as.data.frame(UCBAdmissions)
      Admit Gender Dept Freq
1  Admitted   Male    A  512
2  Rejected   Male    A  313
3  Admitted Female    A   89
4  Rejected Female    A   19
5  Admitted   Male    B  353
6  Rejected   Male    B  207
7  Admitted Female    B   17
8  Rejected Female    B    8
9  Admitted   Male    C  120
10 Rejected   Male    C  205
11 Admitted Female    C  202
12 Rejected Female    C  391
13 Admitted   Male    D  138
14 Rejected   Male    D  279
15 Admitted Female    D  131
16 Rejected Female    D  244
17 Admitted   Male    E   53
18 Rejected   Male    E  138
19 Admitted Female    E   94
20 Rejected Female    E  299
21 Admitted   Male    F   22
22 Rejected   Male    F  351
23 Admitted Female    F   24
24 Rejected Female    F  317

I tried converting the data frame to a table like this, but got an error message:

> barplot(table(as.data.frame(UCBAdmissions)))
Error in barplot.default(table(as.data.frame(UCBAdmissions))) : 
  'height' must be a vector or a matrix

I found this SO link with a non-ggplot answer, but following it gave me the error message shown above.

There is also this SO link but the data is structured differently.

I'm hoping the data can be displayed with just two dimensions. Here is what a simplified grouped barplot looks like.

[Image: grouped barplot]

pmagunia

1 Answer

I'm not exactly sure what you're trying to achieve, but I'll assume you want bars grouped by Dept, and the legend to be a combination of Gender & Admit (just to give the idea).

In the barplot examples you point to, the data is a pure numeric matrix with rownames and colnames set to the labels and groupings. You'll need to start by transforming your data (I use dplyr and tidyr from the tidyverse):

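library(tidyverse)  # loads dplyr and tidyr

# One row per Gender/Admit combination, one column per Dept
df2 = group_by(as.data.frame(UCBAdmissions), Dept, Gender, Admit) %>%
    summarise(Freq = sum(Freq)) %>%
    ungroup() %>%
    mutate(GA = paste(Gender, Admit)) %>%   # combined legend label
    select(Dept, GA, Freq) %>%
    spread(key = Dept, value = Freq) %>%    # long to wide
    as.data.frame()

# Move the labels into the row names and keep only the numeric columns
rownames(df2) = df2$GA
df2 = as.matrix(select(df2, -GA))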

Now your data is in a form which barplot can use:

barplot(df2, beside = TRUE, legend.text = rownames(df2))

[Image: final bar plot]
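
If you'd rather avoid the tidyverse as well, here is a minimal base-R sketch of the same idea. It also shows why the call in the question failed: table() on the data frame cross-tabulates the Freq column too, so the result has four dimensions rather than two. The names adm, bad, GA and counts are just illustrative, and the axis labels are my own choice.

```r
# Base-R sketch: build the Gender/Admit x Dept matrix without the tidyverse.
data(UCBAdmissions)
adm <- as.data.frame(UCBAdmissions)

# table() on the data frame also cross-tabulates Freq, giving a 4-D table,
# which is why barplot() rejected it.
bad <- table(adm)
length(dim(bad))

# Combine Gender and Admit into one label (GA is an illustrative name).
adm$GA <- paste(adm$Gender, adm$Admit)

# xtabs() sums Freq within each GA/Dept cell and returns a 2-D table.
counts <- xtabs(Freq ~ GA + Dept, data = adm)

# A 2-D table is accepted by barplot() as a matrix: columns become groups.
barplot(counts, beside = TRUE, legend.text = rownames(counts),
        xlab = "Dept", ylab = "Freq")
```

The formula in xtabs() decides the layout: the first term becomes the bars within each group, the second becomes the groups along the x-axis, so swapping GA and Dept would give bars grouped by Gender/Admit instead.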

lebelinoz