
I'm currently working on a graph that displays three means for each group of a 2x2x2 factorial design.

Here's a simple example of my data in R:

####Reproducible Example

set.seed(44)

n <- 48
Condition <- c("Exp", "Control")
Sex <- c("Male", "Female")
Ideology <- c("Conservative", "Liberal")


dat <- data.frame(id = 1:n,
                  tidyr::crossing(Condition, Sex, Ideology),
                  Shoe_Size = sample(1:7, n, replace = TRUE),
                  Hat_Size = sample(1:7, n, replace = TRUE),
                  Glove_Size = sample(1:7, n, replace = TRUE))

> head(dat)
  id Condition    Sex     Ideology Shoe_Size Hat_Size Glove_Size
1  1   Control Female Conservative         1        1          6
2  2   Control Female      Liberal         3        5          3
3  3   Control   Male Conservative         3        5          4
4  4   Control   Male      Liberal         1        2          1
5  5       Exp Female Conservative         6        2          6
6  6       Exp Female      Liberal         4        3          7

My goal is to create a graph like this:

[Image: Graph that I'm after]

Without doing it manually like this:

####Graph Example

library(ggplot2)

dat.graph <- data.frame(Condition = c("Exp", "Exp", "Exp", "Exp", "Control", "Control", "Control", "Control",
                              "Exp", "Exp", "Exp", "Exp", "Control", "Control", "Control", "Control",
                              "Exp", "Exp", "Exp", "Exp", "Control", "Control", "Control", "Control"),
                      Sex = c("Male", "Male", "Female", "Female", "Male", "Male", "Female", "Female",
                              "Male", "Male", "Female", "Female", "Male", "Male", "Female", "Female",
                              "Male", "Male", "Female", "Female", "Male", "Male", "Female", "Female"),
                      Ideology = c("Conservative", "Liberal","Conservative", "Liberal","Conservative", "Liberal","Conservative", "Liberal",
                                   "Conservative", "Liberal","Conservative", "Liberal","Conservative", "Liberal","Conservative", "Liberal",
                                   "Conservative", "Liberal","Conservative", "Liberal","Conservative", "Liberal","Conservative", "Liberal"),
                      Clothes = c("Shoes","Shoes","Shoes","Shoes","Shoes","Shoes","Shoes","Shoes",
                                  "Hats","Hats","Hats","Hats","Hats","Hats","Hats","Hats",
                                  "Gloves","Gloves","Gloves","Gloves","Gloves","Gloves","Gloves","Gloves"),
                      Mean_Size = c(3.16, 2.5, 2.1, 7, 5.1, 2.9, 2.1, 6.4,
                               2.63, 3.1, 3.61, 4.4, 3.7, 2.1, 1.2, 2.7, 
                               5.7, 3.2, 2.1, 2.6, 3.1, 6.2, 2.1, 2.6))

ggplot(data = dat.graph, aes(x = Clothes, y = Mean_Size, fill = Ideology)) +
  geom_bar(stat = "identity", width = .5, position = "dodge") +
  facet_wrap(~ Condition + Sex, ncol = 6, drop = FALSE) +
  theme(
    axis.text.x = element_text(angle=50, hjust=1)
  )

Note: For the sake of time, the Mean_Size values are not the actual means from the dat data frame. To get those, I'd have to use the aggregate() function on each of the numeric variables.
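
For what it's worth, aggregate() does not have to be called once per variable: a formula with cbind() on the left-hand side summarises several columns in one call. A minimal sketch, assuming a data frame shaped like dat; the toy data frame and its values are made up purely for illustration:

```r
# Hypothetical toy data with the same columns as dat (values are random)
set.seed(1)
grid <- expand.grid(Condition = c("Exp", "Control"),
                    Sex = c("Male", "Female"),
                    Ideology = c("Conservative", "Liberal"),
                    stringsAsFactors = FALSE)

# Repeat the 8 design cells 6 times to get 48 rows
toy <- grid[rep(seq_len(nrow(grid)), times = 6), ]
rownames(toy) <- NULL
toy$Shoe_Size  <- sample(1:7, nrow(toy), replace = TRUE)
toy$Hat_Size   <- sample(1:7, nrow(toy), replace = TRUE)
toy$Glove_Size <- sample(1:7, nrow(toy), replace = TRUE)

# One call: the mean of all three size columns within each design cell
means_wide <- aggregate(cbind(Shoe_Size, Hat_Size, Glove_Size) ~
                          Condition + Sex + Ideology,
                        data = toy, FUN = mean)
means_wide
```

The result has one row per Condition x Sex x Ideology cell and one column of means per size variable, so it is still in wide form.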

My theory is that to do that, I'd need to transform these three separate numeric variables (Shoe_Size, Hat_Size, and Glove_Size) into three levels of one variable (Clothes) so that I can put that one variable on the x axis of my graph as I've done it "manually".
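
The reshaping described above can be sketched with tidyr::pivot_longer(), which turns several columns into one key column and one value column. The small wide data frame and the value column name Size are assumptions for illustration only:

```r
library(tidyr)

# Hypothetical wide data: one column per clothing item
wide <- data.frame(
  id = 1:4,
  Ideology = c("Conservative", "Liberal", "Conservative", "Liberal"),
  Shoe_Size = c(1, 3, 3, 1),
  Hat_Size = c(1, 5, 5, 2),
  Glove_Size = c(6, 3, 4, 1)
)

# Long form: the three size columns become levels of one Clothes variable
long <- pivot_longer(wide,
                     cols = c(Shoe_Size, Hat_Size, Glove_Size),
                     names_to = "Clothes",
                     values_to = "Size")
long
```

Each original row becomes three rows, one per clothing item, so Clothes can then be mapped to the x axis.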

My questions (and problems) are:

  1. Is my theory correct?
  2. Is it possible to create a graph like the one presented above without doing it manually?

This is my second attempt at explaining my question as clearly as I can at this stage of my learning, so my apologies if anything is still unclear.

Any suggestions or tips will be of massive help!

Ian Campbell
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jul 23 '21 at 17:26
  • Jacob, this is an interesting question, but it is off topic here; try stats.stackexchange.com instead. That said, whether the transformation makes sense depends pretty strongly on what data you have and what goal you are trying to reach. My advice is to say more about that. – Robert Dodier Jul 23 '21 at 21:49

1 Answer


Maybe this is what you are asking. Next time you should make up the data yourself:

ShoeSizes <- c(6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5, 12)
HatSizes <- c(6, 6.125, 6.25, 6.375, 6.5, 6.625, 6.75, 6.875, 7, 7.125, 
              7.25, 7.375, 7.5, 7.625, 7.75, 7.875, 8)
SockSizes <- c(10, 12, 14)

Clothes.list <- list(Shoes=ShoeSizes, Hats=HatSizes, Socks=SockSizes)
Clothes.df <- stack(Clothes.list)[, 2:1]
colnames(Clothes.df) <- c("Type", "Size")
str(Clothes.df)
# 'data.frame': 33 obs. of  2 variables:
#  $ Type: Factor w/ 3 levels "Shoes","Hats",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ Size: num  6 6.5 7 7.5 8 8.5 9 9.5 10 10.5 ...
head(Clothes.df); tail(Clothes.df)
#    Type Size
# 1 Shoes  6.0
# 2 Shoes  6.5
# 3 Shoes  7.0
# 4 Shoes  7.5
# 5 Shoes  8.0
# 6 Shoes  8.5
#     Type   Size
# 28  Hats  7.750
# 29  Hats  7.875
# 30  Hats  8.000
# 31 Socks 10.000
# 32 Socks 12.000
# 33 Socks 14.000
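
The same stack() idea might be applied to a data frame shaped like the one in the question and carried through to the plot. This is only a sketch under assumptions: the data are rebuilt here with expand.grid() rather than tidyr::crossing(), the values are random, and the names sizes, long and means are made up for illustration:

```r
library(ggplot2)

# Rebuild data shaped like the question's dat (random values, 48 rows)
set.seed(44)
grid <- expand.grid(Condition = c("Exp", "Control"),
                    Sex = c("Male", "Female"),
                    Ideology = c("Conservative", "Liberal"),
                    stringsAsFactors = FALSE)
dat <- grid[rep(seq_len(nrow(grid)), times = 6), ]
rownames(dat) <- NULL
dat$Shoe_Size  <- sample(1:7, nrow(dat), replace = TRUE)
dat$Hat_Size   <- sample(1:7, nrow(dat), replace = TRUE)
dat$Glove_Size <- sample(1:7, nrow(dat), replace = TRUE)

# stack() puts the three size columns end to end (columns: values, ind)
sizes <- stack(dat[, c("Shoe_Size", "Hat_Size", "Glove_Size")])

# Repeat the design columns three times so they line up with the stacked values
long <- data.frame(
  dat[rep(seq_len(nrow(dat)), times = 3), c("Condition", "Sex", "Ideology")],
  Clothes = sizes$ind,
  Size = sizes$values,
  row.names = NULL
)

# Mean size per design cell and clothing item: 2 x 2 x 2 x 3 = 24 rows
means <- aggregate(Size ~ Condition + Sex + Ideology + Clothes,
                   data = long, FUN = mean)

p <- ggplot(means, aes(x = Clothes, y = Size, fill = Ideology)) +
  geom_col(width = 0.5, position = "dodge") +
  facet_wrap(~ Condition + Sex, nrow = 1) +
  theme(axis.text.x = element_text(angle = 50, hjust = 1))
p
```

geom_col() is used here in place of geom_bar(stat = "identity"); the two draw the same bars.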
dcarlson