I am using the hist function to analyze some data I generated. For an analysis assay I would like to control the number of histogram bins precisely.

I know the breaks argument, and I can see that in many cases the number of bins is directly related to the number of breaks (i.e. no_bins = no_breaks - 1).

Due to R's algorithm this is not always the case. Is there a way to force R to output a specific number of bins?

Let me know if I need to specify further details.

Best and many thanks!

Arne
  • Have you looked at [this question](http://stackoverflow.com/questions/16931895/exact-number-of-bins-in-histogram-in-r)? – Iaroslav Domin May 12 '16 at 10:09
  • You could also use the ggplot2 library: its geom_histogram lets you control the bins through the binwidth argument (or set their number directly with the bins argument). – ArunK May 12 '16 at 10:11

1 Answer

From ?hist, there are several options for controlling the bins through the breaks argument.

breaks: one of:

  • a vector giving the breakpoints between histogram cells,
  • a function to compute the vector of breakpoints,
  • a single number giving the number of cells for the histogram,
  • a character string naming an algorithm to compute the number of cells (see ‘Details’),
  • a function to compute the number of cells.

In the last three cases the number is a suggestion only; the breakpoints will be set to pretty values. If breaks is a function, the x vector is supplied to it as the only argument.
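
To see what "a suggestion only" means in practice, here is a minimal sketch. The sample data, the seed and the requested value of 7 are made up for illustration and are not from the question.

```r
# Made-up sample data; the seed only makes the run reproducible.
set.seed(1)
x <- rnorm(200)

# Ask for 7 cells. hist() treats this as a hint and picks "pretty" breakpoints.
requested <- 7
h <- hist(x, breaks = requested, plot = FALSE)

h$breaks           # the breakpoints hist() actually chose
length(h$counts)   # the actual number of bins, which may differ from 7
```

Comparing length(h$counts) with the requested value shows whether the hint was followed for a given data set.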

To control the bins exactly, you have to set the breakpoints yourself, either by supplying a vector of breakpoints or a function that computes them. The breakpoints need to cover the entire range of x, and there will be one more breakpoint than bins (i.e. no_bins + 1 = no_breaks).
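
As a rough sketch of that approach, you could build equally spaced breakpoints over the range of the data. The data, the seed and the names n_bins, brks and equal_breaks are assumptions made up for this example. Equal-width bins are only one possible choice.

```r
# Made-up sample data; the seed only makes the run reproducible.
set.seed(1)
x <- rnorm(200)

n_bins <- 7   # the exact number of bins wanted (illustrative value)

# n_bins + 1 equally spaced breakpoints spanning the full range of x.
brks <- seq(min(x), max(x), length.out = n_bins + 1)
h <- hist(x, breaks = brks, plot = FALSE)
length(h$counts)    # 7

# The same idea as a function: hist() calls it with x as the only argument.
equal_breaks <- function(v) seq(min(v), max(v), length.out = n_bins + 1)
h2 <- hist(x, breaks = equal_breaks, plot = FALSE)
length(h2$counts)   # 7
```

Because the breakpoints are given as a vector (or computed by a function that returns one), hist() uses them as they are instead of replacing them with pretty values.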

James