I currently have two separate histograms built with geom_histogram, and I want to combine them into a single plot. Essentially, the blue bars should sit side by side with the red bars so that I can compare the two at each x-axis value:

#First histogram on importance of science

```{r}
lab_data_science <- filter(lab_data, lab_data$V202310 >=2)

science_importance_hist <- lab_data_science %>% 
    ggplot() +
    aes(x = V202310) +
    geom_histogram(position = 'dodge', bins = 4, fill='blue') + 
    labs(
        title    = 'Importance of Science in Decisions About COVID-19',
        subtitle = '2 = Little Important, 3 = Moderately Important, 4 = Very Important, 5= Extremely Important', 
        x        = 'Importance Scale (2 to 5)',
        y        = 'Count', 
        fill     = 'Republican'
)
science_importance_hist
```

Histogram#1 Results Photo Below: C:\Users\Austin Jin\Desktop\pic1.PNG

#Second histogram on Disapprovals of Governor Handling COVID-19

```{r}
lab_data_science2 <- filter(lab_data, lab_data$V201147x >= 1)

governor_covid_disapprovals_hist <- lab_data_science %>% 
    ggplot() +
    aes(x = V201147x) +
    geom_histogram(position = 'dodge', bins = 4, fill= 'red') + 
    labs(
        title    = 'Approvals and Disapprovals of Governor Handling COVID-19',
        subtitle = '1 = Approve Strongly, 2 = Approve Not Strongly, 3 = Disapprove Not Strongly, 4= Disapprove Strongly', 
        x        = 'Approval Scale (1 to 4)',
        y        = 'Count', 
        fill     = 'Republican'
  )
governor_covid_disapprovals_hist
```

Histogram#2 Results Photo Below: C:\Users\Austin Jin\Desktop\pic2.PNG

Any insights would be greatly appreciated, as I've been struggling to combine both histograms into one for a side-by-side comparison. Many thanks in advance, and I will make sure to reward anyone who provides an accurate answer!

  • I'm not sure I understand the desired result. What exactly do you mean by "combine both histograms into one histogram". What exactly will that look like? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jun 22 '21 at 02:59
  • Yeah, exactly like the following image: https://i.stack.imgur.com/lGuxt.png (basically, having the x-axis values of science_importance_hist side by side with the x-axis values of governor_covid_disapprovals_hist) – Lawrence Chin Jun 22 '21 at 03:20
  • Well, that's a picture of a bar chart. Not a histogram. Histograms are usually for continuous data while bar plots are for discrete data. How exactly do you want to summarize your data? – MrFlick Jun 22 '21 at 03:23
  • Pretty much I want to have the sums of each value in the science_importance_hist and sums of each value in the governor_covid_disapprovals_hist on the same x-axis of one histogram so that there is a side-by-side comparison to see whether there is some sort of relationship between the 2 histograms – Lawrence Chin Jun 22 '21 at 03:32
  • How does lab_data_science2 enter into it? Did you mean to use it for the 2nd histogram? – Jon Spring Jun 22 '21 at 03:39
  • So both lab_data_science and lab_data_science2 are part of the same dataset but they are unpaired which is why I'm struggling to combine both histograms into one histogram without changing any of the values. So the x-axis values for the first histogram will be plotted and also the x-axis values for the 2nd histogram will be plotted all in one histogram for side-by-side comparison. Both histograms should be using the same y-axis which would be the number of counts – Lawrence Chin Jun 22 '21 at 03:46
  • What do you mean they are "unpaired"? Two side-by-side histograms would not let you look at the relationship. You can use the patchwork package to combine two separate plots in one display. Do the two questions have the same answer options? You are not providing the needed information. Make a small data set with the two variables, maybe 10 observations. Then we can help. Right now there is no reproducible code because there is no data. Also, `fill = Republican` makes no sense unless you also have a variable called Republican, so put that in the data too. – Elin Jun 22 '21 at 08:19
  • To make your question into a good question you could take the sample data I created and add it. – Elin Jun 22 '21 at 09:12
  • When following Elin's approach, I get a message that the arguments imply differing numbers of rows: 7265, 8218. Does this mean it is not possible to create the single histogram I've been trying to achieve? – Lawrence Chin Jun 22 '21 at 18:12

1 Answer

Although it is hard to know for sure because you haven't provided reproducible data, I think what you want is pivot_longer() from tidyr, and I'd suggest geom_col() rather than a histogram. Let's assume you already have the counts calculated, which you could do with dplyr::summarize().

Note that I am creating example data.

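```{r}
library(ggplot2)

# Example counts per answer option for each question
covid   <- c(15, 23, 10, 4)
science <- c(12, 19, 18, 0)
labels  <- c("excellent", "good", "subpar", "decent")
df <- data.frame(labels, covid, science)

# Reshape to long format: one row per (label, question) pair
df_long <- tidyr::pivot_longer(df, cols = c("science", "covid"),
                               names_to = "Question", values_to = "count")

# Dodged bars place the two questions side by side for each label
ggplot(df_long, aes(x = labels, y = count, fill = Question)) +
    geom_col(position = "dodge2", width = .5)
```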
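If the counts are not available yet, they could be computed from the raw responses roughly as follows. This is only a sketch under assumptions: the toy `lab_data` below is made up (the real survey data is not shown in the question), and `science_counts`, `covid_counts` and `counts_long` are hypothetical names. Each question is filtered and counted on its own and the two summaries are then stacked with `bind_rows()`, so it does not matter that the filtered subsets have different numbers of rows (the "differing number of rows" error comes from putting the two raw columns into one `data.frame()`).

```{r}
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for lab_data; negative codes mimic missing/refused answers
set.seed(1)
lab_data <- data.frame(
    V202310  = sample(c(-9, 1:5), 200, replace = TRUE),
    V201147x = sample(c(-2, 1:4), 200, replace = TRUE)
)

# Count each question separately, after applying its own filter
science_counts <- lab_data %>%
    filter(V202310 >= 2) %>%
    count(value = V202310, name = "count") %>%
    mutate(Question = "science")

covid_counts <- lab_data %>%
    filter(V201147x >= 1) %>%
    count(value = V201147x, name = "count") %>%
    mutate(Question = "covid")

# Stack the two summaries into one long data frame
counts_long <- bind_rows(science_counts, covid_counts)

# Dodged bars: one bar per question at each response code
p <- ggplot(counts_long, aes(x = factor(value), y = count, fill = Question)) +
    geom_col(position = "dodge2", width = .5) +
    labs(x = "Response code", y = "Count")
p
```

Keep in mind that the two questions use different scales (2 to 5 versus 1 to 4), so sharing one x-axis only lines up the numeric codes, not their meanings.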

Elin