I have a dataset of people's birth years. I want to plot a histogram, but since I am working with a fairly large dataset, I would like to group my data in classes of 5. For example, if 30 people were born in 1985, I want the histogram to show a frequency of 6.

This is the code I have so far for my histogram.

ggplot(date, aes(date$year)) + 
  geom_histogram(colour = "black") + 
  labs(title = "...", x = "year", y = "frequency")
camille
Bob Outlook
  • 1
    That does not make sense for a histogram which works with continuous values. Do you mean you want larger bins on a histogram? Or are you plotting a barplot and want bigger intervals? – user2974951 Dec 03 '18 at 13:24
  • 2
    [Don't use `$` inside your `aes` calls](https://stackoverflow.com/q/32543340/5325862) – camille Dec 03 '18 at 17:09

3 Answers

You could just change the labels on the y-axis to reflect the transformation you wish:

ggplot(date, aes(year)) + 
  geom_histogram(colour = "black") + 
  labs(title = "...", x = "year", y = "frequency") + 
  scale_y_continuous(labels=function(x) x/5)

Here's an example with some fake data:

[Image: histogram of the original fake data, without transformation]

[Image: the exact same data, with the added scale_y_continuous line]
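
Since the images are not reproduced here, a minimal sketch of such an example follows. The data frame `fake_dates` and its layout of 30 people per year are assumptions made for illustration, not the data behind the original plots. `binwidth = 1` is also an added assumption, chosen so that each bar covers exactly one year.

```r
library(ggplot2)

# Hypothetical fake data: 30 people for each birth year from 1980 to 1989
fake_dates <- data.frame(year = rep(1980:1989, each = 30))

p <- ggplot(fake_dates, aes(year)) +
  geom_histogram(binwidth = 1, colour = "black") +   # one bar per year
  labs(title = "...", x = "year", y = "frequency") +
  # Divide the axis labels by 5; the underlying counts are left untouched
  scale_y_continuous(labels = function(x) x / 5)

print(p)
```

Note that this only relabels the axis. The bar heights are still the raw counts, so a bar with a count of 30 is read against a label of 6.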

iod

With a bar plot:

library(dplyr)
library(ggplot2)

dates_df <- data.frame(year = sample(1950:2018, size = 100000,replace = TRUE)) # randomly generated years

classes <- 5  

dates_df %>% group_by(year) %>% summarise(cnt = n()) %>% 
  ggplot(aes(x= year, y = cnt/classes)) + 
  geom_col(colour = "black") + 
  theme_bw()
emsinko

You can also try this:

library(data.table)
library(dplyr)
library(ggplot2)

fake_data <- data.table(name = c('John', 'Peter', 'Alan', 'James', 'Jack', 'Elena', 'Maria'),
                        year = c(2018, 2018, 2018, 2017, 2016, 2017, 2018))

fake_data %>%
  group_by(year) %>%
  # count the distinct people per year, then scale that count down by 5
  summarize(numb_people = length(unique(name)),
            number_people_freq = length(unique(name)) / 5) %>%
  as.data.table() %>%
  ggplot(aes(year)) +
  geom_bar(aes(y = number_people_freq), stat = 'identity') +
  labs(title = "...", x = "year", y = "frequency")
D Petrova
  • 1
    If you're going with a geom_bar anyway, you don't need to do the transformation in the dataframe. You can just define it inside the aes. – iod Dec 03 '18 at 14:16
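
One way to read the comment above is sketched below: let `geom_bar` do the counting and divide the computed count inside `aes`. This assumes ggplot2 3.3.0 or later for `after_stat()`, and it assumes one row per person, since `geom_bar` counts rows rather than unique names. The data is the fake data from the answer, rebuilt as a plain data frame.

```r
library(ggplot2)

# Fake data from the answer above, as a plain data frame (one row per person)
fake_data <- data.frame(
  name = c("John", "Peter", "Alan", "James", "Jack", "Elena", "Maria"),
  year = c(2018, 2018, 2018, 2017, 2016, 2017, 2018)
)

p <- ggplot(fake_data, aes(year)) +
  # after_stat(count) is the per-year row count computed by geom_bar;
  # dividing by 5 here rescales the bar heights without touching the data
  geom_bar(aes(y = after_stat(count) / 5)) +
  labs(title = "...", x = "year", y = "frequency")

print(p)
```

Unlike relabelling the axis, this changes the plotted bar heights themselves, so no summarising step is needed before the plot.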