I am using R to plot a histogram.

I used this code to sum numVisits within each numHospStays category:

pll <- medical %>%  group_by(numHospStays) %>% summarise(val=sum(numVisits))

The result is a data frame (a tibble) with one row per value of numHospStays and its total in val.

I am trying to plot a histogram with the numHospStays column on the x-axis and the val column as the height (density) of each numHospStays category.

Sarah john

3 Answers

Would ggplot(pll) + geom_col(aes(x = numHospStays, y = val)) work? This is a bar plot, which I think functions very much like a histogram here.
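To make the suggestion above easy to try, here is a minimal sketch. The medical data frame below is a made-up stand-in (the real data are not shown in the question), so the values are an assumption for illustration only; the column names are taken from the question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `medical`; the real data are not shown in the question
medical <- data.frame(
  numHospStays = c(0, 0, 0, 1, 1, 2, 2, 3),
  numVisits    = c(5, 3, 8, 4, 6, 2, 1, 7)
)

# Sum numVisits within each numHospStays category, as in the question
pll <- medical %>%
  group_by(numHospStays) %>%
  summarise(val = sum(numVisits))

# One bar per category, with bar height equal to the summed value
p <- ggplot(pll) +
  geom_col(aes(x = numHospStays, y = val))
p
```

geom_col() uses the y values as they are, which is why it suits data that have already been summarised.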

Ben Toh

You have a discrete value on the x axis, so this is technically a bar chart rather than a histogram. You can use either geom_bar() or geom_col() for this:

medical %>%
  group_by(numHospStays) %>%
  summarise(val = sum(numVisits)) %>%  # pipe the summary into ggplot()
  ggplot(aes(x = numHospStays, y = val)) +
  geom_col(fill = "deepskyblue2", color = "black") +
  labs(x = "Number of hospital stays", y = "Count")

Or, to emphasize the exponential fall-off in the number of admissions, try a log scale on the y axis, plus perhaps a fill scale for aesthetic value and a tweak to the overall look using theme_bw():

medical %>%
  group_by(numHospStays) %>%
  summarise(val = sum(numVisits)) %>%  # pipe the summary into ggplot()
  ggplot(aes(x = numHospStays, y = val)) +
  geom_col(aes(fill = numHospStays)) +
  scale_fill_gradient(low = "forestgreen", high = "red", guide = guide_none()) +
  labs(x = "Number of hospital stays", y = "Count") +
  scale_y_log10() +
  theme_bw()

Allan Cameron

Providing a reproducible example with your data will help us to answer your question.

Making some assumptions about your data, you can create a density histogram without the group_by and summarise steps using the code below:

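# after_stat(density) replaces the ..density.. notation, which is deprecated since ggplot2 3.4.0
ggplot(data = medical) +
  geom_histogram(aes(x = numHospStays, y = after_stat(density)))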
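Note that a histogram built from the raw rows counts rows per value of numHospStays; it does not sum numVisits. If the summed values from the question are what should drive the bar heights, one option is the weight aesthetic, which the binning stat behind geom_histogram() understands. The sketch below uses a made-up medical data frame (an assumption, since the real data are not shown) and binwidth = 1 so that each integer value gets its own bin.

```r
library(ggplot2)

# Hypothetical stand-in for `medical`; the real data are not shown in the question
medical <- data.frame(
  numHospStays = c(0, 0, 0, 1, 1, 2, 2, 3),
  numVisits    = c(5, 3, 8, 4, 6, 2, 1, 7)
)

# Each row contributes numVisits (rather than 1) to its bin;
# after_stat(density) rescales the bars so their total area is 1
p <- ggplot(medical) +
  geom_histogram(
    aes(x = numHospStays, y = after_stat(density), weight = numVisits),
    binwidth = 1
  )
p
```

With a bin width of 1, each bar height is that category's share of the total number of visits.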
EJJ