
This is my first time using ggplot to make a histogram, and for my data the exported histogram is the one below.

[Image: the exported histogram]

What I don't like is that the first bin does not include zero; it starts at about 5 or 6. Has anyone encountered something like this?

I have tried a range of bin widths, from 1 to 20, and it keeps happening.

Data:

structure(list(V1 = c(8, 4.4, 9.4, 29.4, 135.6, 65, 70.9, 15.2, 38.8, 87.2, 5.2, 0.2, 7.8, 46.4, 35.9, 77.4, 34.2, 157.4, 46.4, 19, 43.8, 41.2, 96.8, 25.6, 40.2, 111.8, 111.8, 49.8, 39.4, 9.6, 11.6, 8.6, 44.2, 41, 4.6, 36.2, 12.4, 45.8, 0, 30.8, 134.6, 167.2, 13.8, 56.6, 112.3, 13.6, 18.8, 18.2, 7, 40.4, 30.8, 130.2, 234.6, 106.2, 87.2, 15, 7.6, 63, 18, 2.6, 28, 24, 153.2, 24.4, 69.6, 27, 134, 181.6, 46, 85.4, 18.6, 32, 83.6, 42.6, 32.8, 127.5, 92.8, 122, 129.6, 35.4, 20.6, 88, 14.8, 12.8, 33.8, 58.6, 104.2, 0.2)), class = "data.frame", row.names = c(NA, -88L))

Code:

library(ggplot2)
library(scales)
dat = read.csv('mo.csv')
p2=ggplot(dat, aes(x = dat$v2)) +
  geom_histogram(color="black", fill="grey40", bins=25)+ 
  scale_x_continuous(breaks = seq(0, 255, 25), limits = c(0,255), expand=c(0,0))+
  scale_y_continuous(expand = c(0,0),limits = c(0,16.5), breaks = pretty(dat$v2, n = 140))
p2
Roman
  • Welcome to StackOverflow. See [how to ask a question](https://stackoverflow.com/help/how-to-ask), and [how to make a great reproducible example](https://stackoverflow.com/a/5963610/2359523) to aid others in answering your questions. Without seeing the data and the code you are using, it's hard to say. You likely don't have data below 5 or 6. – Anonymous coward Nov 30 '18 at 21:45
  • This answer might be helpful: https://stackoverflow.com/a/46453008/8583393 – markus Nov 30 '18 at 21:50
  • Thank you for answering. My whole data set contains only about 100 values. Those below 5 or 6 are these: 0, 0.2, 0.2, 2.6, 4.4, 4.6, 5.2. Binwidth is set to 10 for the image above. –  Nov 30 '18 at 21:55
  • Hello @userwrld_is, please post your code and data into your original question. – Roman Nov 30 '18 at 23:12
  • Hello @Roman. I just updated the question. Hope it helped. –  Nov 30 '18 at 23:27
  • Thanks. I removed the Google Drive link and pasted the data using the `dput()` command. This way others can reproduce it with one line of code. – Roman Nov 30 '18 at 23:39

2 Answers

ggplot(dat, aes(x = V1)) +
  geom_histogram(color = "black", fill = "grey40", bins = 20) +
  scale_x_continuous(breaks = seq(0, 250, 25)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 16.5), breaks = pretty(dat$V1, n = 140))

This seems to be what you're looking for, although 25 bins look much better than 20, imo.
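
Another option, sketched here under assumptions that are not part of the answer above, is to anchor the bins with `boundary = 0` in `geom_histogram()` and set the visible range with `coord_cartesian()` rather than with scale limits. The data frame is a small illustrative subset of the question's values, and the bin width of 10 is the one mentioned in the question's comments.

```r
library(ggplot2)

# Illustrative subset of the question's values (not the full 88 rows).
dat <- data.frame(V1 = c(0, 0.2, 0.2, 2.6, 4.4, 4.6, 5.2, 8, 29.4, 65, 135.6, 234.6))

p <- ggplot(dat, aes(x = V1)) +
  # boundary = 0 places a bin edge at 0, so the bins are 0-10, 10-20, ...
  geom_histogram(binwidth = 10, boundary = 0, color = "black", fill = "grey40") +
  scale_x_continuous(breaks = seq(0, 250, 25), expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  # coord_cartesian() zooms the view without discarding any bars.
  coord_cartesian(xlim = c(0, 250))

# Inspect the computed bins: the first one should start at 0.
bins <- ggplot_build(p)$data[[1]]
head(bins[, c("xmin", "xmax", "count")])
```

Printing `p` draws the plot. Because the x range is set in the coordinate system instead of the scale, bars that touch the edge of the range are kept rather than dropped.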

Jared C
  • Hello @Jared C. I can't remove `limits = c(0,255)` because otherwise the `breaks = seq(0, 255, 25)` does not work. Thank you. Edit: It works, but it only goes up to 225 and I want it to go up to 250. That's why I am adding the limits argument. –  Dec 01 '18 at 00:04
  • Try my updated code. It starts at 0 and includes 250 at the end. – Jared C Dec 01 '18 at 00:22
  • Thanks man. That's better. Mine hadn't been working because `expand = c(0,0)` was after `breaks` –  Dec 01 '18 at 00:24

Using your data and the simplest possible visualization, I get:

[Image: histogram of V1]

Are you sure you are using the right variable (column)? Try to run the code below.

By the way: in ggplot2, you do not need to reference the data frame inside the aes() call; the bare column name is enough (aes(V1), not aes(dat$V1)). Also, using <- as the assignment operator (rather than =, as in many other programming languages) is recommended practice in most cases.

Code

ggplot(data, aes(V1)) +
    geom_histogram(color = "black", fill = "grey40", bins = 25)

Data

data <- structure(list(V1 = c(8, 4.4, 9.4, 29.4, 135.6, 65, 70.9, 15.2, 38.8, 87.2, 5.2, 0.2, 7.8, 46.4, 35.9, 77.4, 34.2, 157.4, 46.4, 19, 43.8, 41.2, 96.8, 25.6, 40.2, 111.8, 111.8, 49.8, 39.4, 9.6, 11.6, 8.6, 44.2, 41, 4.6, 36.2, 12.4, 45.8, 0, 30.8, 134.6, 167.2, 13.8, 56.6, 112.3, 13.6, 18.8, 18.2, 7, 40.4, 30.8, 130.2, 234.6, 106.2, 87.2, 15, 7.6, 63, 18, 2.6, 28, 24, 153.2, 24.4, 69.6, 27, 134, 181.6, 46, 85.4, 18.6, 32, 83.6, 42.6, 32.8, 127.5, 92.8, 122, 129.6, 35.4, 20.6, 88, 14.8, 12.8, 33.8, 58.6, 104.2, 0.2)), class = "data.frame", row.names = c(NA, -88L))
Roman
  • Thanks for helping. I am using the same data and at first I get the same plot as you. It's also weird how this time the bin starts before zero. It's just the appearance that changes when I add `limits = c(0,250)`. –  Nov 30 '18 at 23:59
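
The behaviour described in this last comment can be checked by inspecting the bin edges that ggplot2 computes. What follows is a minimal sketch, assuming a bin width of 10 and a small illustrative subset of the question's values. When neither `boundary` nor `center` is given, `geom_histogram()` centers its bins on multiples of the bin width, so the first bin here spans -5 to 5. A bar that extends below 0 falls partly outside `limits = c(0, 250)`, which is a likely reason it disappears once the limits are added.

```r
library(ggplot2)

# Illustrative subset of the question's values (not the full 88 rows).
dat <- data.frame(V1 = c(0, 0.2, 0.2, 2.6, 4.4, 4.6, 5.2, 8, 29.4, 65, 135.6, 234.6))

# Default binning: no boundary or center given, no scale limits.
p_default <- ggplot(dat, aes(V1)) +
  geom_histogram(binwidth = 10)

# Look at the edges of the first bin as computed by the stat.
default_bins <- ggplot_build(p_default)$data[[1]]
default_bins[1, c("xmin", "xmax", "count")]
```

If the first row shows a negative `xmin`, the first bar starts before zero, which matches what the comment observes.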