Suppose I have the following data

set.seed(85)
test <- factor(replicate(10,sample(0:3,5,rep=TRUE)))
set.seed(108)
test2 <- factor(replicate(10,sample(0:3,5,rep=TRUE)))

and the following bargraphs:

barplot(table(test), col=rgb(0,1,0,.5))
barplot(table(test2), col=rgb(1,0,0,.5))

How do I combine these into one graph, with one bar graph superimposed on the other? Something similar to this:

[image of the desired plot not preserved]

Any help is much appreciated.

Peter
Icewaffle
  • do you need `barplot(rbind(table(test), table(test2)), beside = TRUE)` – akrun Apr 22 '20 at 21:43
  • or if it is superimpose, check [here](https://stackoverflow.com/questions/34204198/how-to-superimpose-bar-plots-in-r) – akrun Apr 22 '20 at 21:44
  • The graph you show is a histogram, not a bar graph. Your tables suggest you want a bar graph rather than a histogram. Ideally you need some 'distribution' data for a histogram. – Peter Apr 22 '20 at 21:55
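
For reference, the superimposed version hinted at in the comments can be sketched in base R by drawing the second `barplot()` with `add = TRUE`. This is only a minimal sketch: the names `counts1`, `counts2` and `ymax` are made up for illustration, and fixing the factor levels to `0:3` is an assumption so that both tables always have the same four bars.

```r
# Same data as in the question; levels are fixed so both tables line up
set.seed(85)
test <- factor(replicate(10, sample(0:3, 5, replace = TRUE)), levels = 0:3)
set.seed(108)
test2 <- factor(replicate(10, sample(0:3, 5, replace = TRUE)), levels = 0:3)

counts1 <- table(test)
counts2 <- table(test2)

# Shared y-axis limit so the taller bars of either table are not clipped
ymax <- max(counts1, counts2)

# First bar chart, then the second one drawn on top of it
barplot(counts1, col = rgb(0, 1, 0, 0.5), ylim = c(0, ymax))
barplot(counts2, col = rgb(1, 0, 0, 0.5), add = TRUE,
        axes = FALSE, axisnames = FALSE)  # avoid drawing the axes twice
```

The semi-transparent colours make the overlap visible; where both bars cover the same area the colours blend.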

2 Answers

This combines two bar charts as set out in the question.


library(tibble)
library(dplyr)
library(tidyr)
library(ggplot2)

# Counts per level; as.vector() drops the table class so ggplot sees plain numbers
tib <- 
  tibble(t1 = as.vector(table(test)),
         t2 = as.vector(table(test2))) %>% 
  mutate(group = 0:3)  # the four possible values, 0 to 3

ggplot()+
  geom_bar(data = select(tib, -2), aes(group, t1), stat = "identity", fill = "red", alpha = 0.25, width = 1)+
  geom_bar(data = select(tib, -1), aes(group, t2), stat = "identity", fill = "green", alpha = 0.25,width = 1)

Which results in: [plot image not preserved]
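
As a possible variation on the bar chart code above, the counts can be put into one long data frame and drawn with a single `geom_col()` layer, which also gives a legend for free. This is a sketch rather than the method used in this answer; the data frame name `counts` and the labels `"t1"`/`"t2"` are illustrative, and the factor levels are fixed to `0:3` by assumption.

```r
library(ggplot2)

set.seed(85)
test <- factor(replicate(10, sample(0:3, 5, replace = TRUE)), levels = 0:3)
set.seed(108)
test2 <- factor(replicate(10, sample(0:3, 5, replace = TRUE)), levels = 0:3)

# One row per (test, score) combination; Freq holds the count
counts <- rbind(
  data.frame(test = "t1", as.data.frame(table(score = test))),
  data.frame(test = "t2", as.data.frame(table(score = test2)))
)

# position = "identity" draws the bars on top of each other instead of stacking
p <- ggplot(counts, aes(score, Freq, fill = test)) +
  geom_col(position = "identity", alpha = 0.5, width = 1)
p
```

Mapping `fill` to `test` instead of hard-coding the colours means a third group would only need another row block in `counts`.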

However, what I think you might be really looking for is:

library(tibble)
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(85)
test <- replicate(10,sample(0:3,5,rep=TRUE))

set.seed(108)
test2 <- replicate(10,sample(0:3,5,rep=TRUE))


# Flatten each 5 x 10 matrix to a vector, then reshape to long format
tib1 <-
  tibble(t1 = c(test),
         t2 = c(test2)) %>% 
  pivot_longer(cols = c(t1, t2), names_to = "test", values_to = "score")

ggplot(tib1, aes(score, fill = test))+
  geom_histogram(bins = 4, alpha = 0.5, position = "identity")


Which gives you: [plot image not preserved]

If the number of observations differs between the tests and you want to plot a histogram, create two data frames and combine them for graphing:

set.seed(85)
test1 <- tibble(score = c(replicate(20, sample(0:3, 5, rep = TRUE))),
                test = "t1")

set.seed(108)
test2 <- tibble(score = c(replicate(10, sample(0:3, 5, rep = TRUE))),
                test = "t2")


tib1 <-
  test1 %>% 
  bind_rows(test2)

ggplot(tib1, aes(score, fill = test))+
  geom_histogram(bins = 4, alpha = 0.5, position = "identity")

If you prefer the geom_bar version you can adapt the previous code as follows:

ggplot()+
  geom_bar(data = test1, aes(score), stat = "count", fill = "red", alpha = 0.25, width = 1)+
  geom_bar(data = test2, aes(score), stat = "count", fill = "green", alpha = 0.25,width = 1)
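
With unequal sample sizes, raw counts make the larger group look taller across the board. If that matters for your comparison, one option is to plot densities instead of counts via `after_stat(density)` (available in ggplot2 3.3.0 and later). The sketch below uses base `data.frame()` and `rbind()` rather than tibbles, and the name `both` is illustrative; with a bin width of 1 the density of each bar equals the proportion of observations in it.

```r
library(ggplot2)

# 100 observations for t1, 50 for t2, as in the example above
set.seed(85)
test1 <- data.frame(score = c(replicate(20, sample(0:3, 5, replace = TRUE))),
                    test = "t1")
set.seed(108)
test2 <- data.frame(score = c(replicate(10, sample(0:3, 5, replace = TRUE))),
                    test = "t2")

both <- rbind(test1, test2)

# Density is computed per group, so each group's bars sum to 1 here
p <- ggplot(both, aes(score, fill = test)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 alpha = 0.5, position = "identity")
p
```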

By the way, unless you have other reasons for using replicate, you could probably simplify your code: with the same seed, c(replicate(10, sample(0:3, 5, rep = TRUE))) should give the same values as sample(0:3, 50, rep = TRUE). The two faceted panels below let you compare them:

set.seed(108)
s1 <-  c(replicate(10, sample(0:3, 5, rep = TRUE)))

set.seed(108)
s2 <-  sample(0:3, 50, rep = TRUE)

tib <- 
  tibble(t1 = s1,
         t2 = s2) %>% 
  pivot_longer(t1:t2, names_to = "test", values_to = "score")

ggplot(tib, aes(score, fill = test))+
  geom_histogram(bins = 4, alpha = 0.5, position = "identity") +
  facet_wrap(~test)
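
If you would rather check the equivalence directly than compare panels by eye, a quick sketch is to compare the two vectors with `identical()`. The result can depend on your R version and its sampling settings, so run it on your own installation rather than taking it on trust.

```r
# Ten draws of five values, flattened into one vector of 50
set.seed(108)
s1 <- c(replicate(10, sample(0:3, 5, replace = TRUE)))

# One draw of 50 values from the same seed
set.seed(108)
s2 <- sample(0:3, 50, replace = TRUE)

# TRUE means the two approaches produced exactly the same sequence
identical(s1, s2)
```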
Peter
  • Thanks. However, I now realize that my actual data have different lengths (test & test2) and therefore this solution does not work. Do you know how to make tibble solutions work with data of different lengths? – Icewaffle Apr 23 '20 at 20:33
  • See the latest edits, I've also added a comment about generating the data for the graphs. Which may or may not be helpful! – Peter Apr 24 '20 at 06:07

Here is working code that gets approximately what you're looking for. As mentioned in the comments, you should use the hist function rather than the barplot function when plotting distributions.

set.seed(1234)
a <- rnorm(n = 1000, mean = 0, sd = 1)
b <- rnorm(n = 1000, mean = 3, sd = 2)

min_x <- min(c(a, b))
max_x <- max(c(a, b)) # we want the first plot to have the right "size" on the x-axis

#First plot:
hist(a, breaks = 100, 
     col = rgb(red = 0, blue = 0, green = 1, alpha = .5), 
     border = NA, xlim = c(min_x , max_x ))

#Superimposing the second plot:
hist(b, breaks = 100, 
     col = rgb(red = 0, blue = 1, green = 0, alpha = .5), 
     border = NA, add = TRUE)

Result: [plot image not preserved]

Frédéric
  • This does not work because the data does not seem to be of the same type as the original example. When I try this solution I get the error "Error in hist.default(test, breaks = 100, col = rgb(red = 0, blue = 0, : 'x' must be numeric" – Icewaffle Apr 23 '20 at 20:30
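
The error in the comment above arises because `test` in the question is a factor, while `hist()` needs a numeric vector. A minimal sketch of one way around it is to convert the factor labels back to numbers with `as.numeric(as.character(...))` first. The names `x1`, `x2`, `brks`, `h1`, `h2` and `ymax` are illustrative, and the breaks at -0.5, 0.5, ..., 3.5 are an assumption chosen so that each integer score gets its own bin.

```r
# Data exactly as in the question (factors)
set.seed(85)
test <- factor(replicate(10, sample(0:3, 5, replace = TRUE)))
set.seed(108)
test2 <- factor(replicate(10, sample(0:3, 5, replace = TRUE)))

# Convert via character: as.numeric() alone would return the level codes 1..4
x1 <- as.numeric(as.character(test))
x2 <- as.numeric(as.character(test2))

# One bin per integer score
brks <- seq(-0.5, 3.5, by = 1)

# Compute counts without plotting, to find a common y-axis limit
h1 <- hist(x1, breaks = brks, plot = FALSE)
h2 <- hist(x2, breaks = brks, plot = FALSE)
ymax <- max(h1$counts, h2$counts)

hist(x1, breaks = brks, col = rgb(0, 1, 0, 0.5), ylim = c(0, ymax),
     main = "test vs test2", xlab = "score")
hist(x2, breaks = brks, col = rgb(1, 0, 0, 0.5), add = TRUE)
```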